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CHAPTER  1 


THE  PATH  FOLLOHIHG  PROBLEM 

1.  Introduction 

The  problen  addressed  in  this  dissertation  can  be  described  as 
follows.  Assume  that  we  are  given, 

H  :  Q  c  Rn+1  -*■  Rn 
2 

H  €  C  ,  l.e. ,  H  is  twice  continuously  differentiable 
H(y°)  -  0 
detfH' (y°) J  A  0. 

Under  these  conditions,  we  know,  by  the  inpliclt  function  theorem,  that 

for  some  neighborhood  H  of  y* ,  C  n  H  is  a  smooth  one-dimensional 

manifold,  where  C  is  the  maximal  connected  subset  of  H  *(y^)  ■  {y  e  Q 

H(y)  -  0}  containing  y®.  The  problem  of  interest  may  then  be  stated 

as  the  numerical  task  of  tracing  C  in  some  specified  direction, 

0 

starting  at  y  and  continuing  until  some  point  of  interest  is 
encountered  or  until  C  ceases  to  be  a  smooth  one-dimensional 
manifold.  He  shall  be  mainly  concerned  with  the  case  in  which  n  is 
large  and  H'(y)  is  sparse. 
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2.  Applications  of  Path-following 

Applications  of  the  path-following  problem  fall  roughly  into  two 
categories:  parametric  problems  and  homotopy  problems.  By  “parametric 
problem8“  we  mean  those  problems  in  which  motion  along  the  curve  may 
represent  the  variation  of  some  system  under  study,  as  some  naturally 
occurring  parameter  in  the  system  is  changed.  For  these  problems,  the 
entire  curve  is  normally  of  interest  and  numerical  methods  for  tracing 
it  are  usually  required  to  do  so  very  closely.  Into  this  category  fall 
the  so-called  continuation  problems  (Wacker  (1978)).  Homotopy  problems 
are  generally  directed  towards  the  solution  of  some  nonlinear  system 
where  this  solution  is  represented  by  a  specific  point  of  C. 

Algorithms  for  tracing  C  in  this  case  are  concerned  only  with 
reaching  this  point  of  interest;  C  is  useful  only  as  a  guide  to  get 
this  solution  and  hence  it  may  be  quite  loosely  followed  until  we  get 
close  to  the  desired  point. 

Example  1:  Parametric  Structural  Problem  (Rhelnboldt  (1981)) 

Figure  (1.2.1)  represents  a  simple  plane  structure  in  which  two 
Identical  rods  with  longitudinal  elastic  modulus  y  are  pin-jointed  to 
the  supports  A  and  B  and  together  at  C.  The  vertical  displacement, 
x,  under  load  p  satisfies  the  equation 

H(x,p)  -  y[  I  ■  1+ -  — j  ~  l]  (h-x)  -  p  -  0  , 

1  -  (h-x)* 


where  h  is  the  vertical  distance  of  C  from  Afi  under  zero  load. 

The  structural  engineer,  who  wishes  to  study  the  behavior  of  the  system 
under  varying  load  p,  is  Interested  in  all  values  (x,p)  satisfying 
H(x,p)  -  0  and  p  >.  0.  Hence  he  is  faced  with  the  problem  of  tracing 
C  ■  {(x,p)  :  H(x,p)  ■  0}  from  the  point  (0,0)  in  the  direction  of 
Increasing  p.  The  curve  C  is  shown  in  Figure  (1.2.2),  where  the  load 
p  ■  p  is  an  asymptote  of  C  and  represents  a  buckling  point  of  the 
system.  For  larger  and  more  complex  structures,  the  system  of  equations 
is  larger  and  more  complex  but  the  underlying  path  tracing  problem  is 
the  same.  0 

Example  2:  Homotopy  problem 

By  Brouwer's  Fixed  Point  Theorem,  we  know  that  f ( •)  has  a  fixed 
point  in  [1/e,  e]  where  f  :  Rn  ■*  Kn 

n 

f  (x)  -  exp[cos(  l  x,)]  for  i-1,  2,  ...,n. 

j-1  J 

We  can  locate  this  fixed  point  by  tracing  the  path  C  ■  {(x,t)  :  H(x,t) 

-  0}  where 

H(x,t)  ■  x  -  tf(x) 

from  the  point  (0,0),  starting  Initially  in  the  direction  of  increasing 
t  and  then  continuing  along  C  until  some  point  (x,l)  is 
encountered.  Then  x  is  a  fixed  point  of  f(  •).  0 
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Note  that  homotopy  problems  sometimes  do  require  close  path  follow¬ 
ing.  In  Example  (2)  above,  the  path  C  possesses  very  high  curvature 
through  almost  its  entire  length  and  any  path-following  technique  that 
is  used  must  follow  it  very  closely  or  risk  the  danger  of  losing  it 
altogether.  Also  in  the  homotopy  problem  for  solving  for  all  solutions 
of  polynomial  systems  (see  Garcia  and  Zangwill  (1981),  Rosenberg  (1983)) 
several  separate  paths  are  followed  to  different  solutions;  each  path 
must  be  closely  followed  in  order  to  minimize  the  danger  of  slipping 
from  one  path  to  another. 


3.  Historical  Background 

The  path-following  problem  has  developed  historically  from  two 
completely  separate  directions.  In  one  direction  lies  the  classical 
continuation  problem  as  discussed  in  Ortega  and  Rheinboldt  (1970),  Smale 
(1976)  and  Wacker  (1978);  in  the  other  lie  the  piecewise-linear 
simplicial  techniques  as  discussed  in  Eaves  (1976)  and  Todd  (1976a). 

The  classical  continuation  technique  was  developed  as  a 
globalization  of  Newton's  method  and  was  used  for  problems  in  which  good 
starting  points  were  not  available.  Given  a  function  f  :  Rn  -*•  Rn  a 
homotopy  is  set  up  to  produce  a  smooth  path  from  the  available  starting 
point  x^.  This  path  is  followed  In  the  hope  that  it  will  lead  to  a 
solution  of  f(x)  ■  0.  Examples  of  possible  homotopies  are 
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H(x, t)  *  (l-t)f(x)  +  tA(x-x^)  A  €  Rn*n  (regularizising  homotopy) 
H(x,t)  ■  f(x)  -  (l-t)f(x^)  (defect  reducing  homotopy) 

The  classical  continuation  method  used  "t"  as  the  independent  variable 
so  that  the  algorithm  proceeded  along  the  following  lines: 

(i)  Choose  some  partition  0  ■  t^  <  t*  <  •••  <  t^  *  <  t^  ■  1  . 

(ii)  Solve  H(x,t*)  ■  0  by  Newton's  method  using  as  starting  point 

x*  *  where  x*  *  is  the  solution  of  H(x,t*  *)  *  0. 

Ic 

Note  that  x  solves  f(x)  ■  0.  As  we  shall  see  later,  this  algorithm 
is  a  special  —  and  inefficient  —  implementation  of  the  predictor- 
corrector  method.  It  breaks  down  if  t  does  not  increase  monotonically 
along  the  curve  C.  This  is  one  important  difference  between  the 
classical  continuation  method  and  more  recent  path-following 
techniques.  The  classical  method  relinquished  the  path-following  task 
if  the  curve  "turned  backwards”;  the  later  methods  continue  along  the 
curve  around  such  bends  by  using  "t"  as  a  dependent  variable,  choosing 
instead  the  arclength,  s,  as  the  independent  variable  as  introduced  by 
Haselgrove  (1961).  One  important  problem,  to  which  many  papers  have 
been  addressed  (e.g. ,  Wacker  et  al.  (1978),  Deuflhard  (1979),  Den  Heijer 
and  Rhelnboldt  (1981))  is  the  question  of  efficient  adaptive  choice  of 
the  partition  {t*}  as  the  algorithm  progresses.  Efforts  to  attack 
this  problem  were  directed  to  the  analysis  of  radii  of  convergence  of 
Newton's  method.  Leder  (1970)  presented  another  adaptive  technique  by 
reformulating  the  path-following  problem  as  an  optimization  problem: 


where 


min  IG(x,t)l 


G(x,t) 


H(x,t) 

m(l-t) 


m  4  0  ,  m  constant  . 


Adaptive  incrementing  of  "t"  was  achieved  by  a  steplength  control 
based  on  a  monotonicity  test  in  the  sense  of  Goldstein-Armijo  (see 
Armijo  (1966)  and  Goldstein  (1967)). 

The  slmplicial  techniques  originated  with  Scarf  (1967a)  and  were 
based  on  the  ideas  of  complementary  pivoting  as  presented  in  Lemke  and 
Howson  (1964).  The  relationship  between  these  techniques  and  homotopy 
methods  was  studied  by  Eaves  (1972)  and  Merrill  (1972).  For  an  exten¬ 
sive  bibliography  on  simplicial  techniques  see  Eaves  (1976),  Todd 
(1976a)  and  Allgower  and  Georg  (1980).  Scarf's  paper  waa  motivated  by 
the  search  for  a  fixed  point  of  a  mapping.  Kellog,  Li  and  Yorke  (1976), 
using  a  non-retraction  principle,  provided  a  constructive  proof  of 
Brouwer's  fixed-point  theorem  for  smooth  mappings,  and  thus  established 
a  link  between  differentiable  homotopy  techniques  and  simplicial 
methods.  This  led  to  a  revitalization  of  interest  in  the  continuation 
technique.  Later  followed  the  studies  on  differentiable  homotopies  by 
Chow  et  al.  (1978),  Garcia  and  Gould  (1978)  and  a  host  of  other 
publications  Including  the  extensive  numerical  results  of  Watson  (1981). 
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for  more  predictor-corrector  cycles  to  trace  C.  Increasing  the 
predictor  steplength  leads  to  more  work  being  needed  in  the  corrector 
phase,  and  we  may  also  obtain  corrector  sequences  which  fail 
completely.  An  efficient  implementation  of  a  predictor-corrector 
algorithm  requires  a  dynamic  resetting  of  the  current  steplength  and  an 
effective  strategy  for  handling  those  cases  in  which  the  corrector 
sequence  diverges.  Any  such  steplength  strategy  depends  in  turn  on  how 
well  we  expect  the  predictor  to  behave  and  how  powerful  the  corrector 
technique  is.  Ideally  we  would  like  to  allow  for  adaptive  choice  of 
predictor  and  corrector  techniques  and  corresponding  dynamic  reevalu- 
atlon  of  steplength  strategies,  provided  this  can  be  obtained 
economically. 

The  predictor-corrector  method  is  in  sharp  contrast  to  simpllcial 
path-following  techniques.  While  the  latter  is  not  conceptually  as 
simple  as  the  predictor-corrector  method,  once  a  triangulation  has  been 
chosen  implementation  is  a  relatively  easy  task.  The  strength  of  the 
predictor-corrector  method  versus  simpllcial  techniques  is  its  adaptive 
ability  to  move  rapidly  through  well-behaved  portions  of  C  by  using 
large  steplengths  and  to  slow  down  at  more  difficult  portions  of  C. 

It  is  the  attempt  to  exploit  this  capability  that  leads  to  difficulties 
in  implementation.  The  statement  of  L.F.  Shampine  (see  Watson  (1979c)) 
in  reference  to  numerical  techniques  for  the  solution  of  differential 
equations  (".  .  .  how  a  method  is  implemented  may  be  more  important  than 
the  method  itself")  is  clearly  applicable  to  the  predictor-corrector 
method.  It  should  be  noted  that  predictor-corrector  methods  and 
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(i) 


k+1  k  ,  .k  k  .  k.  k  n  ,  k ,  ,  .  .k  . 

z  *  y  +  X  v  where  H'(y  )  v  *  0,  lv  I  ■  1  and  X  is 

some  predetermined  step  length. 

„  0  k+1 

(ii)  Set  w  ■  z  and 

-1 

H(wj) 

0 


wJ+1  »  vr>  - 


H'(wj) 

,  k.T 
(v  ) 


(iii) 
where  the 
satisfied, 
and 


k+1  1 

y  is  taken  to  be  the  last  w  in  the  sequence  in  (ii) 


iteration  is  terminated  when  some  stopping  criteria  are 

k 
!i 


k+1  L  ,  k+1 

e.g. ,  y  may  be  taken  as  z4 


where  Bw  -  w  II  <  e 


»wJ  e  for  j  <  1,  for  some  specified  e  >  0  . 

The  Iteration  in  (ii)  represents  the  refinement  of  the  initial  estimate 

k+1 

z  in  the  hyperplane  through  z  perpendicular  to  the  tangent 
k+1 

direction  at  y  ,  see  Figure  (2.1.1). 

The  conceptual  simplicity  of  the  predictor-corrector  method  is 
misleading;  in  practice  an  efficient  implementation  is  a  quite  difficult 
process.  There  are  many  problems  which  may  arise.  At  the  heart  of 
these  problems  lies  the  decision,  at  each  predictor  step,  of  what  step- 
length  should  be  used  in  estimating  the  next  point  on  the  curve.  If  a 
small  steplength  is  used,  we  can  expect  the  next  Iterative  sequence  of 
corrector  steps  to  be  quickly  convergent.  However,  there  is  a  tradeoff 
involved  in  the  use  of  small  predictor  steps  since  this  incurs  the  need 
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CHAPTER  2 


THE  PREDICTOR-CORRECTOR  METHOD 

1.  Introduction 

Recall  the  path-following  problem  stated  in  Chapter  1.  Given  H  e 

C2,  H  :  Q  c  Rn+1  -*■  nn,  H(y°)  -  0  and  detlH'(y0)]  h  0  we  wish  to 

trace  C,  the  one-dimensional  manifold  containing  y^ ,  which  is 

contained  in  Hi(0)-{ye  :  H(y)  -  0}.  In  this  chapter  we 

shall  discuss  the  computational  considerations  involved  in  using  a 

predictor-corrector  method  to  trace  C.  Although  we  are  mainly 

concerned  with  the  specific  case  when  n  is  large  and  H'  («)  is 

sparse,  most  of  the  following  discussion  is  more  generally  applicable. 

The  predictor-corrector  algorithm  traces  C  by  moving  from  one 

point  in  C  to  another  by  the  following  conceptually  simple  process. 

0  1  k 

Assume  successive  points  y  ,  y  ,  . . . ,  y  in  C  have  already  been 
k+1 

located.  Then  y  is  obtained  by 
(i)  Predict 

k+1  k+1 

Obtain  z  as  an  approximation  to  y  by  using  some 

0  1  k 

extrapolation  technique  based  on  the  points  y  ,  y  ,  • • • »  y  and 

possible  other  information  at  these  points,  e.g. ,  H'(y^),  H'(y^) . 

H’(yk)  etc. 

(ii)  Correct 

k+1 

Refine  the  estimate  z  by  using  some  local  iterative 

scheme. 

For  example,  if  an  Euler  predictor  and  a  Newton  corrector  are  used 
k+1 

then  y  is  obtained  as  follows: 
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7.  The  Predictor-Corrector  Method 

Equation  (6.2)  has  essentially  two  separate  components.  The  first 
term  on  the  right-hand  side  propels  the  algorithm  along  the  tangent  of 
the  curve,  while  the  second  term  pushes  toward  the  curve  so  as  to 
decrease  the  error  Incurred  in  the  forward  motion.  The  predictor- 
corrector  method  Involves  an  explicit  Implementation  of  this  idea. 

Motion  along  the  curve  is  achieved  by  predictor-corrector  cycles  which 
first  predict  a  point  further  along  on  the  curve  and  then  use  a  local 
iterative  technique  to  correct  for  the  error  in  prediction.  Further 
details  are  given  in  Chapter  2. 

The  most  expensive  part  of  the  predictor-corrector  algorithm  is  the 
local  iterative  sequence  used  in  the  corrector  phase.  Traditionally, 
Newton '8  method  has  been  the  choice  for  the  iterative  technique,  but 
this  may  turn  out  to  be  an  expensive  overkill.  We  shall  turn  Instead  to 
the  use  of  quasi-Newton  methods.  These  are  discussed  in  Chapter  3  for 
the  specific  case  of  large  sparse  systems. 
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has  suggested  approximating  H' (y)  by  the  use  of  quasi-Newton  updates. 
Georg  (1981)  also  uses  quasi-Newton  updates,  but  first  replaces  (6.1) 
with  another  system  with  better  stability  properties.  For 
A  €  nIlx(n+l)f  Xet  t(A)  e  Rn  be  the  unique  vector  satisfying 

A 

At(A)  -  0,  lt(A)  I  -  1,  det  [  _]  >  0  . 

t(A)1 


Then  (6.1(a))  Is  replaced  by 


-  6  •  t(H' (y))  -  [H'(y)]+  H(y),  6  >  0  , 

j.  m  w  _1 

where  A  “  A1(AA1)  Is  the  Moore-Penrose  inverse  of  A.  (6.2) 

The  second  term  on  the  right-hand  side  of  (6.2)  Introduces  a  damping 
element  into  the  vector  field,  which  makes  for  a  more  stable  Integra¬ 
tion.  Choice  of  6  may  be  made  adaptively  over  different  parts  of  C 
with  6  large  where  the  curvature  Is  small  and  vice  versa. 

Tracing  C  by  the  use  of  formulation  (6.1)  or  (6.2)  is.  In 
general,  not  a  very  efficient  technique,  even  though  It  does  have  the 
convenient  property  of  being  able  to  make  use  a  readily  available 
software.  The  predictor-corrector  technique  which  is  the  main  focus  of 
this  research  Is  more  efficient  than  either  the  differential  equation 
approach  or  the  slmpllclal  continuation  technique. 
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6.  The  Davldenko  Differential  Equation 

Tracing  C  can  be  accomplished  by  first  converting  the  problem 
into  the  following  system  of  differential  equations  (Davidenko  (1953)): 


(.)  H'(y)g-0 

<">  <£'  '  1 


(c)  det 


H'  (y) 


(dl)T 

Ms' 


>  0 


(d)  y(0)  -  y 


(6.1) 


The  first  equation  is  derived  from  differentiation  of  H(y)  *  0,  the 
second  from  the  use  of  arclength,  s,  as  the  independent  variable  and  the 
third  equation  chooses  the  sign  of  dy/ds  to  obtain  a  consistent 
orientation.  The  system  (6.1)  can  now  be  Integrated  using  any  of  the 
efficient  and  available  computer  packages  for  solving  the  Initial  Value 
Problem.  Watson  (1981a)  has  demonstrated  the  effectiveness  of  this 
approach. 

Integration  of  (6.1)  is,  in  general,  a  quite  expensive  process. 
Obtaining  dy/ds  Involves  solving  the  linear  system  (6.1(a))  and, 
hence,  requires  the  calculation  of  H' (y)  at  each  step.  Schmidt  (1979) 
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Beginning  from  a  simplex  along  the  chain  of  simplices  which  define  the 
piecewise  linear  path,  G  *(0),  an  attempt  is  made  to  predict  a  simplex 
further  along  on  this  chain  by  the  use  of  polynomial  extrapolation. 
Usually  the  predicted  simplex  does  not  lie  in  the  chain  but  just  off 
it.  A  correction  technique,  based  on  topological  perturbations,  is  then 
used  to  eliminate  this  prediction  error  by  locating  a  simplex  on  the 
chain  close  to  the  predicted  simplex.  This  technique  has  been 
incorporated  into  the  Scout  Continuation  Package  which  has  been  used  in 
the  study  of  various  continuation  problems  (Peitgen  and  Prufer  (1979)). 

The  main  disadvantage  of  the  simpllcial  technique  is  its  inability 
to  handle  problems  of  large  dimensions.  The  number  of  simplices  that 
need  to  be  traversed  grows  rapidly  with  increasing  dimension.  In  Todd 
(1980a)  and  Salgal  (1981),  techniques  are  presented  for  alleviating  this 
problem  of  dimensionality  by  exploiting  structure  and  sparsity,  which 
may  be  encountered  in  large  systems.  When  separability  or  sparsity  is 
encountered  in  the  system  H( •)  and  certain  specific  triangulations  are 
used,  the  piecewise  linear  function  G( •)  is  linear  over  regions  which 
may  span  groups  of  adjacent  simplices.  With  the  use  of  appropriate  data 
structures,  the  simpllcial  pivot  required  in  moving  from  one  simplex  to 
another  within  these  pieces  of  linearity  can  be  reduced  to  a  trivial 
amount  of  work;  the  corresponding  function  evaluation  is  also  virtually 
eliminated.  While  these  techniques  have  done  a  lot  towards  expanding 
the  range  of  applicability  of  simpllcial  techniques,  much  work  still 
needs  to  be  done  before  simpllcial  techniques  can  be  used  as  a  general 
path-following  technique  for  large  systems. 
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Two  published  techniques  designed  to  attack  this  problem  of 
Inefficiency  in  slmplicial  continuation  methods  do  so  by  relaxing  the 
inflexibility  of  the  algorithm.  The  basic  standard  technique  is 
retained  at  points  of  high  curvature  so  as  to  take  advantage  of  its 
robust  properties.  In  other  regions  of  the  path,  where  high  curvature 
is  not  encountered  and  robustness  is  not  absolutely  necessary,  more 
flexible  techniques  are  used  to  move  rapidly  and  Inexpensively  along  the 
path. 

The  first  technique  is  the  “flex  slmplicial  algorithm"  of  Garcia 
and  Zangwill  (1980).  Instead  of  working  with  a  fixed  triangulation 
which  is  imposed  at  the  start  of  the  procedure,  the  algorithm  allows  for 
the  formation  of  simplices  of  varying  sizes  as  it  moves  along  the  path. 
Large  simplices  are  used  in  regions  where  the  path  is  well-behaved;  when 
high  curvature  is  encountered,  the  algorithm  switches  to  a  fixed  trian¬ 
gulation.  In  this  way  efficiency  and  robustness  are  adaptively  traded. 
The  present  computational  status  of  the  algorithm  is  unclear.  To  date, 
no  significant  computational  experience  has  been  reported.  Effective 
exploitation  of  the  basic  idea  of  the  method  may  still  require  much 
research  into  setting  up  efficient  decision  rules  for  the  formation  of 
simplices  along  the  path  as  the  algorithm  progresses. 

The  second  technique  is  the  "slmplicial  predictor-corrector  method" 
of  Saupe  (1982).  Here  a  fixed  triangulation  is  used  and  at  points  of 
high  curvature  the  standard  slmplicial  pivoting  algorithm  is  in  force. 

In  more  well-behaved  regions,  however,  the  algorithm  attempts  to  skip 
simplices  along  the  path  by  using  predictor-corrector  techniques. 
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Theorem  (5.2) 

n  |  1  - 

Let  H  :  Q  c  H  -*•  F  where  Q  -  D  x  T,  D  is  the  closure 
of  an  open  bounded  set  in  Rn  and  H  ^(0)  is  a  one-dimensional 
manifold  .  If  H(«)  satisfies  a  Lipschitz  condition  with  constant  K 
then  for  y  in  the  interior  of  any  simplex  of  the  triangulation  with 
simpllclal  grid  size  e 

IG’(y)  -  H'(y)l  <  Kne  . 

Proof:  See  Salgal  (1977).  □ 

Simplicial  techniques  which  employ  fine  triangulations  are  very 
expensive.  Each  simplex  encountered  along  the  piecewise  linear  path 
incurs  a  cost  of  one  function  evaluation  and  one  linear  programming 
pivot.  Decreasing  the  grid  size  results  in  more  slmplices  being 
encountered  and  this  leads  to  an  expensive  algorithm  if  the  path  is  very 
long.  On  the  other  hand  simpllclal  path-following  techniques  are  very 
robust  and  can  be  used  on  highly  nonlinear  problems  for  which  other 
path-following  methods  would  fail.  They  remain  unaffected  by  high 
curvature  of  the  path,  which  may  cause  other  methods  to  lose  the  correct 
sense  of  direction  or  to  cycle.  The  price  of  this  robustness,  though, 
is  very  high  if  the  underlying  path  is  very  long  and  it  has  to  be 
closely  followed. 
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developed  to  solve  for  fixed  points  of  nonlinear  systems.  Hence,  the 
main  concern  of  these  algorithms  was  to  get  to  an  endpoint  of  the  path 
rather  than  approximating  the  path  closely  throughout.  The  refining 
triangulations  of  Eaves  (1972)  are  designed  specifically  for  this  goal; 

H  ^(0)  is  very  loosely  approximated  when  we  are  far  away  from  the  point 
of  interest  and  very  closely  approximated  as  we  move  towards  the  level 
t  ■  1.  The  continuation  problem  has  somewhat  different  requirements; 
here,  it  may  be  necessary  to  follow  H  *(0)  very  closely  throughout,  as 
in  the  study  of  nonlinear  eigenvalue  problems  (Peitgen  and  Prufer 
(1979))  or  in  the  problem  of  finding  all  solutions  of  polynomial  systems 
of  equations.  A  fine  triangulation  must  be  used  throughout  so  that 
G  *(0)  is  close  to  H  ^(0).  The  following  results  relate  G  *(0)  and 
H  *(0).  The  slmplicial  grid  is  defined  as  the  length  of  the  largest 
edge  over  all  simplices  in  the  triangulation. 

Theorem  (5.1) 

Let  H  :  Q  c  -►  Rn  where  Q  ■  D  *  T  and  D  is  the  closure 

of  an  open  bounded  set  in  Rn.  For  any  6  >  0,  if  the  slmplicial 
grid  of  the  triangulation  is  small  enough  and  y  e  G  *(0),  then 

dist(y,  H  *(0))  =  minOy-z#  :  z  e  H  *(0)}  <  6  . 

Proof:  See  Garcia  and  Zangwill  (1981),  Chapter  12.  □ 
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Example  3  (Freudenthal  Triangulation) 

Denote 

N  =  {1,  2,  ....  n} 

7i  5  some  permutation  of  N 


,0  _ 


,n  *i 


Kj  =  {y  e  R  :-g-  is  an  integer  for  i  e  N}  , 

for  some  6  >  0 

K.(y°,  it)  =  the  simplex  spanned  by  {y^,  y1,  y11} 


where 


0  „0  i  1-1  -  7i(i)  c  is*, 

yeKj,  y  -  y  +  6  e  ,  for  1  <  i  <  n 


and 


-K 


for  k  -  j 
otherwise 


Kj  is  the  set  of  all  simplices  K^y^.n)  when  6-1.  We  can 
choose  other  values  for  6  to  scale  the  triangulation  and  it  can  be 
made  as  fine  as  we  want  by  taking  6  sufficiently  small.  □ 


The  8impllclal  technique  follows 


G-1(0) 


by  using  the  pivoting 


techniques  of  linear  programming.  The  principle  of  complementary 
pivoting  and  the  use  of  perturbatic  techniques  ensure  that  this  path  is 
well-defined.  Traditionally,  simpliclal  path-following  techniques  were 


with  many  curves  Cg  (for  e  in  some  region  Dq  <=  Kn)  most  of  which 
are,  by  Sard's  theorem,  smooth  one-dimensional  manifolds.  On  the  other 
hand,  the  set  of  numbers  which  can  be  stored  on  a  computer  is  countable 
and  hence  has  measure  zero;  in  such  a  situation,  it  is  not  clear  what 
the  value  of  a  probabilistic  statement  is. 

For  the  homotopy  problem,  it  is  useful  to  know,  a  priori,  whether 
the  curve  C  reaches  the  level  t  ■  1.  For  a  discussion  of  boundary- 
free  conditions  which  are  sufficient  to  ensure  that  this  will  occur,  and 
which  can  be  shown  to  be  true  in  some  cases,  see  Eaves  (1976)  and  Garcia 
and  Zangwill  (1981).  In  most  cases,  no  such  result  can  be  verified  and 
the  implementation  of  a  homotopy  method  needs  to  follow  the  advice  of 
Alexander  (1978b):  "Have  faith." 

5.  Simplicial  Path-Following  Methods 

Slmpliclal  pivoting  techniques  provide  an  effective  and  robust 
method  for  following  a  piecewise  linear  approximation  to  C.  Moreover, 
this  method  can  work  with  weaker  differentlality  conditions  on  the 
function  H(«);  in  fact,  all  that  is  needed  is  upper  semi-continuity  of 
H(  • ) .  The  first  step  of  the  simplicial  technique  is  to  impose  a  trian¬ 
gulation  on  Rn;  for  details  see  Eaves  (1976)  or  Todd  (1976a).  This 
divides  Rn  into  a  countable  number  of  simplices.  The  algorithm 
then  works  with  the  piecewise  linear  function  G(>),  instead  of  with 
H( •),  where  G  agrees  with  H  at  the  nodes  of  the  simplices  and  is 
linear  within  simplices. 
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Path  Existence 


It  Is  of  Interest  to  know  whether  we  can  make  any  statements,  prior 
to  attempting  to  trace  the  curve  numerically,  concerning  the  existence 
of  such  a  curve  and  Its  smoothness  and  boundedness  properties.  As  noted 
in  Section  (1),  all  that  we  can  guarantee  Is  that  since  det  [H'(y^)]  f 
0,  there  is  some  neighborhood  M  containing  y^  such  that  M  n  C  is  a 
smooth  one-d lmens ional  manifold.  However  if  we  proceed  along  the  curve 
outside  H  we  may  discover  that  it  ceases  to  be  a  one-dimensional 
manifold.  Sard's  theorem  provides  a  global  probabilistic  statement 
concerning  the  nature  of  C. 

Sard's  Theorem  (Sard  (1942)) 

q  n 

Let  G  s  D  c  R  -*  R  ,  where  D  is  the  closure  of  an  open  set, 
and  let  G  be  k  times  continuously  differentiable  where  k  >  1  + 
max{0,q-n}.  Then  for  almost  all  e 

rank  G'(y)  **  n  for  all  y  e  G  *(e)  =  {y  e  D  :  G(y)  ■  e}  .  Q 

This  result  tells  us  that  even  though  C  may  not  be  a  one¬ 
dimensional  manifold,  we  can  use  an  arbitrarily  small  perturbation,  e, 
so  that,  by  the  implicit  function  theorem,  Cg  ■  {y  :  H(y)  «  e}  is  a 
one-dimensional  manifold.  In  practice,  the  roundoff  error  encountered 
by  a  numerical  technique  for  tracing  C  automatically  provides  pertur¬ 
bations  in  the  problem,  so  we  can  expect,  in  some  sense,  to  be  dealing 


simpllcial  techniques  are  not  mutually  exclusive,  as  demonstrated  In 
Saupe  (1982). 

2.  Predictor  Considerations 

k+1 

The  techniques  used  to  obtain  the  Initial  estimate,  z  ,  of  the 
k+1 

next  point,  y  ,  in  C  can  vary  from  being  very  simple  to  quite 

sophisticated.  The  choice  of  predictor  depends  mainly  on  how  powerful 

we  can  expect  the  corrector  technique  to  be.  For  example,  if  Newton's 

method  is  used  for  correcting  then  simple  and  crude  predictors  can  work 

quite  well.  Simple  predicting  methods  involve  fewer  arithmetic  calcula 

tions  and  also  require  that  less  information  be  retained  from  previous 

points,  thus  incurring  a  lower  storage  cost. 

(i)  The  Elevator  Predictor  (Garcia  and  Zangwlll  (1981)) 

k+1  k 

This  is  a  simple  predictor  which  obtains  z  from  y  by 

moving  parallel  to  one  coordinate  axis.  If  v  satisfies 

1c  “k  — k  k  “k 

H'(y  )v  ■  0,  Iv  I  ■  1  and  v  is  some  approximation  to  v  then 


k+1 


where 


j  i 
j  -  1 

i  ■  argmax{ IvS}  , 


J3  ’ 

yj  +  Xfe 


(2.1 


(2.2 


and  X  is  some  predetermined  steplength.  If  It  Is  known,  a  priori, 
that  the  path  is  monotonic  in  one  coordinate,  say  y  ,  then  we  may 
choose  i  ■  p  throughout  Instead  of  using  (2.2),  and  thus  avoid  the 
need  to  approximate  v  .  For  example,  if  y  -  (x,t)  and  H^(x,t)  is 
nonsingular  at  all  points  of  C,  then  we  can  deduce  that  either  dt/ds  >  0 
or  dt/ds  <  0  for  all  points  in  C  (where  s  -  arclength).  In  such  a 
case  we  may  simply  set  i  *  n+1  (since  t  ■  y  t ^ ) .  Monotonic  curves 
arise  quite  often  in  practice;  an  important  case  Involves  certain 
homotopies  used  to  solve  polynomial  systems  of  equations  (Rosenberg 
(1983)). 

(ii)  Polynomial  and  Hermlte  Predictors 

k+1 

More  sophisticated  predictors  obtain  z  by  fitting  polynomials 

to  past  points  yk,  y^  *,...,  y^  of  C  and  derivatives  to  the  curve 

at  these  points.  In  particular,  Adams-Bashforth  predictors  (Shampine 

and  Gordon  (1975))  are  important  for  their  excellent  stability 

properties  and  their  capacity  to  make  use  of  approximations  of  the 

-k.  k 

tangent  directions  v  at  points  y  of  C.  Using  arclength,  s,  as 
the  Independent  variable  for  parameterization  of  C,  with  y(0)  ■  y  , 
we  have,  at  the  start  of  the  (k+1)  predictor  step  the  following 
approximations : 

y(Sj )  a  y^  ,  y'(Sj)  *  ^  for  j  -  0,  1,  ...,  k  (2.3) 

The  Adams-Bashforth  predictor  based  on  p  points  obtains  the  polynomial 
P_  1,(1)  which  satisfies 


p„  for  j  •  k-p+1,  k-p+2,  k  (2.4) 

P»K  J 

and  then  for  some  predetermined  steplength  -  s^+^  -  obtains 
k+1 

z  as 

Z**1  -  yk  +  /  k+1  F  (t)  dt  .  (2.5) 

s  P,K 

8k 

It  Is  easily  shown  that  if  a  constant  steplength  A  is  used  then  the 

k  k+1 

asymptotic  error  t  -  Iz  -  y(s^j)l  satisfies 

rk  -  0(AP+2)  .  (2.6) 


The  use  of  the  Adams-Bashforth  predictor  is  useful  in  predictor- 

corrector  methods  because  of  its  insensitivity  to  small  errors  in  the 

tangent  directions  at  y^ ,  0  <  j  <  k.  These  errors  arise  because  we  may 

try  to  estimate  the  tangent  directions  instead  of  directly  calculating 

0  1  k 

them  and  also  because  the  points  y  ,  y  ,  . . • .  y  are  not  exactly  in 

0  1  k 

C.  In  fact,  the  previously  estimated  points  y  ,  y  ,  •••,  y  may  be 
quite  loosely  arranged  around  C  since  a  strategy  of  loose  and 
inexpensive  path  following  may  be  employed  until  we  get  to  the  region  of 
C  that  is  of  primary  interest. 

3.  Corrector  Considerations 

The  usual  and  convenient  corrector  technique  Involves  the  use  of  a 

k+1 

locally  convergent  iterative  scheme  in  a  hyperplane  through  z  ,  that 


is  sufficiently  traversal  to  C.  As  in  the  example  of  section  (1),  we 

k  k+1 

may  choose  the  hyperplane  P  ■  {y  :  v  (y-z  )  -  0}.  In  this  case  we 

k+1 

use  an  Iterative  technique  to  obtain  y  as  the  solution  of 


G(y) 


H(y)  -j 

k,  k+lv 
v  (y-z  )-l 


0  . 


(3.1) 


If 

Note  that  since  rank  [H' (y  ) ] 


n  and  H* (yk)  vk 


0  we  have  that 


G'(yk) 


is  nonsingular.  Hence  using  a  continuity  argument  we  can  deduce  that 
k+1  k 

for  A^  -  s  -  s  small  enough,  there  exists  a  neighborhood  Q  such 
k  k+1 

that  y  ,  z  and  y(8^+. )  e  Q  and  G' (y)  is  nonsingular  for  y  €  Q. 
This  establishes  the  feasibility  of  using  Newton-type  iterations  in  the 
solution  of  (3.1). 

He  shall  be  concerned  mainly  with  the  use  of  Newton  and  quasi- 
Newton  methods  in  solving  (3.1)  for  y^j.  Each  corrector  step  will  be 
a  full  Newton  step,  i.e. ,  no  steplength  control  is  imposed  on  the  cor¬ 
rector  sequence.  The  basic  philosophy  of  the  predictor  corrector 

method,  as  Implemented  here,  is  that  if  full  Newton  steps  do  not  lead  to 

k+1 

convergence  of  the  corrector  sequence  then  it  probably  means  that  z 
is  too  far  away  from  C  and  hence  we  should  restart  the  predictor- 
corrector  cycle  with  a  shorter  steplength,  A^.  The  alternative  strategy 
of  using  a  steplength  control  to  make  it  more  likely  that  the  corrector 
sequence  will  converge  Introduces  certain  problems.  First,  if  we  use 
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large  steps  and  then  encourage  convergence  by  the  use  of  a  corrector 

steplength  control  strategy  then  It  become  more  likely  that  we  may 

converge  to  a  solution  of  (3.1)  other  than  the  one  we  are  seeking  (see 

k+l 

Figure  (2.3.1)).  Also  In  those  cases  in  which  a  sharp  turn  causes  z 
to  be  very  far  away  from  C  (Figure  (2.3.2))  or  even  leads  to  the 
infeasibility  of  (3.1)  (Figure  (2.3.3)),  then  more  work  is  expended 
before  the  sequence  is  eventually  relinquished  as  divergent.  For  the 
purpose  of  safe  and  effective  curve  following  it  seems  better  to  retain 
a  full-step  strategy  on  corrector  sequences;  steplength  controls  may 
increase  the  likelihood  of  convergence  but  not  necessarily  to  the 
desired  point  and  hence  make  it  more  likely  that  the  algorithm  may  run 
into  cycles  (Figure  (2.3.4))  or  switch  to  points  in  (y  :  H(y)  -  0}\C 
(Figure  (2.3.1)). 

In  the  classical  continuation  method  (Wacker  (1978)),  Newton's 
method  is  used  on  the  corrector  steps.  This  algorithm  was  known  as  the 
Global  Newton  Method  since  the  continuation  technique  was  introduced 
specifically  to  enlarge  the  domain  of  convergence  of  Newton's  method  by 
the  construction  of  a  homotopy.  The  use  of  quasi-Newton  methods  on  the 
corrector  steps,  while  mentioned  by  several  authors  (Schmidt  (1979), 
Deuflhard  (1979),  Rheinboldt  (1974)),  was  first  discussed  in  Georg 
(1981b).  Here  the  focus  was  on  small  problems,  for  which  Broyden's 
Update  proved  to  be  a  quite  effective  tool  for  maintaining  a  useful 
approximation  of  H'(y)  throughout  the  entire  length  of  C,  with  only 
an  occasional  need  to  re-evaluate  H'(y)  from  scratch. 


•*_  wv 

A-  t   S 
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Figure  2.3- 


Figure  2.3. 


For  large  sparse  systems  the  problem  is  somewhat  more  difficult. 

The  sparse  Broyden  method  may  be  used  as  in  the  case  for  small  systems. 
However,  as  discussed  in  Chapter  3,  this  incurs  a  considerable  storage 
cost  since  both  the  current  matrix  and  its  LU  factors  must  be  main¬ 
tained  explicitly.  An  alternative  strategy  is  to  use  the  direct  secant 
updates  of  LU  factors  which  will  be  discussed  in  Chapter  3.  Note, 
however,  that  since  these  updates  can  be  expected  to  behave  properly 
only  on  local  convergence  problems — because  of  their  inability  to  update 
pivoting  strategies — their  use  Introduces  the  need  to  recalculate  the 
Jacobian,  H'(y),  from  scratch  at  start  of  each  corrector  sequence. 

4.  Orientation 

At  the  start  of  each  predictor  step  we  need  to  establish  the 

V  »|f 

correct  orientation  of  the  curve  C,  i.e. ,  given  H'(y  )  w  -  0, 

— ]j£  —If 

lw  I  »  1  and  w  is  some  approximation  of  w  ,  then  the  decision 

k  k  k  1c 

has  to  be  made  of  whether  v  -  -w  or  v  ■  +w  ,  where  v  is  the 
direction  in  which  we  shall  proceed. 

We  assume  that  the  path  is  parameterized  by  arclength,  that  is 

C  ■  {y(s)  :  0  <  8  <  s> 

where  s  is  the  length  of  the  curve  up  to  the  point  of  interest.  Under 
the  assumption  that  rank  [H'(y(s))]  -  n  for  0  <  s  s,  we  have  the 
Basic  Differential  Equation  (Garcia  and  Zangwill  (1981))  which  describes 
C: 
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for  1  m  1 ^  2t  » • • i 


(4.1) 


rrfl 


where 


u±  -  q(-l)1  det[H^(y)]  for  i  -  1,  2 . irt-1  (4.2) 

and  q  -  +1  depending  on  original  orientation.  H^(y)  «  Rn*n  is 
obtained  by  eliminating  the  ith  column  of  H'(y).  We  choose  q  so 
that  (4.1)  is  satisfied  at  y(0)  -  y®,  and  then  It  remains  constant 
thereafter. 

We  know  by  Sard's  Theorem  that  rank  [H' (y(s))]  -  n  for  0  s  _<  s 
is  an  event  of  probability  one.  Hence  in  order  to  establish  the  correct 
orientation  we  can  choose  v  -  u/lul  where  u  is  given  by  (4.2). 
However,  in  practice,  interesting  problems  do  arise  in  which  the  curve 
C  does  not  have  the  full  rank  property  throughout.  In  such  cases  the 
path  C  may  intersect  other  curves  <=  {y  ;  H(y)  “0).  At  such  points 
of  intersection  —  known  as  bifurcation  points  —  we  have  det[A(s)]  ■  0 
where 


A(s) 


H’(y(s)) 

[dy/d8]T 


(4.3) 


A  bifurcation  point,  y(s),  is  characterized  as  odd  or  even  depending 
on  the  number  of  eigenvalues  of  A(s)  that  go  to  zero  (Crandall  and 
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Rabinowitz  (1971)).  If  the  predictor-corrector  algorithm  jumps  over  an 

odd  bifurcation  point  then  we  need  to  change  the  sign  of  q  if  we  wish 

to  continue  along  C.  Keeping  q  constant  will  cause  the  algorithm  to 

double  back  along  that  part  of  C  from  which  it  came. 

In  order  to  ensure  that  the  predictor-corrector  method  follows  the 

curve  safely,  we  monitor  the  orientation  as  given  by  (4.1),  and  we 

k+1 

employ  the  following  orientation  strategy.  The  point  y  is  taken  to 
be  an  acceptable  next  point  along  C  if 

,  k  k+1.  ,  a  ,  k  k+1*  .  .  //  ,\ 

arcos(v  »w  )  <  0  or  arcos(-v  »w  )  _<  0  (4.4) 

k+1  -k+1 

where  9  is  some  given  acute  angle  and  w  approximates  w  where 

w1^1  is  defined  by  H'(yk+1)  w1**1  -  0,  Hwk+1 1  -  1.  If  (4.4)  is 
satisfied,  then  let 


k+1  .  ,  k  k+1.  k+1 

v  -  slgn(v  »w  )  w 


(4.5) 


This  strategy,  along  with  (4.2),  results  in  a  safe  curve  following 
algorithm  and  can  be  used  to  detect  when  an  odd  bifurcation  has  been 
encountered. 


5.  Steplength  Strategy 

The  predictor  steplength  strategy  is  the  most  crucial  control 
problem  which  arises  in  an  Implementation  of  a  predictor-corrector 
algorithm.  An  over-conservative  steplength  strategy,  which  employs 
small  steps,  leads  to  successful  corrector  sequences  but  to  many 


predictor-corrector  cycles.  A  more  ambitious  steplength  strategy 
results  iu  longer  corrector  sequences  or  to  failed  corrector  sequences 
for  which  the  predictor-corrector  cycle  must  be  restarted  with  smaller 
steplength.  Denote  by  X  the  current  value  of  the  steplength  for  the 
kth  cycle.  One  basic  steplength  strategy  Is  the  following: 


Steplength  Strategy  (A): 


hj.1  U 

i)  X  «-  min{\  ,  aX  }  for  some  a  >  1. 
max 

k+1  k+1 

ii)  If  the  corrector  sequence  fails  then  set  X  X  /  p  for 

p  >  1,  and  restart  the  predictor-corrector  cycle. 


ill)  If  necessary  repeat  (ii)  until  the  corrector  sequence 

k+1 

converges  or  until  X  <  ^min» 

Maximum  and  minimum  steplengths  are  denoted  by  X  ,  X^^  The 
maximum  allowable  steplength  prevents  dangerously  large  steplengths 
being  taken  which  may  lead  to  loss  of  the  curve;  if  the  steplength 
required  for  convergence  falls  below  ^m^n>  then  the  algorithm  has 
failed.  Strategy  (A)  attempts  to  increase  the  steplength  at  the  start 
of  each  cycle.  Several  consecutive  successful  corrector  sequences  will 
cause  the  steplength  to  grow  at  an  exponential  rate.  A  more 
conservative  strategy  is: 


*  •  •  J  m  -m  •  -  m  •  m  ■ 


•  *-*■ 


Steplength  Strategy  (B): 


...  min{\  ,  a\k}  if  >  X^+*  or  k  ■  2  where  a  >  1 

1)  xk+1  *  ( 

K  otherwise 

ii)  Same  as  in  Strategy  (A). 

Hi)  Same  as  in  Strategy  (A). 

Strategies  (A)  and  (B)  are  very  simple  procedures  and  are  based  on 
the  idea  of  becoming  more  ambitious  when  things  appear  to  be  going  well 
and  more  conservative  when  difficulties  are  encountered.  More 
sophisticated  strategies  take  into  consideration  the  specific  predictor 
and  corrector  techniques  being  employed.  Most  such  strategies  attempt 
to  minimize  the  number  of  predictor-corrector  cycles  by  estimating,  at 
the  start  of  each  cycle,  the  maximum  possible  steplength  that  could  be 
taken  and  still  allow  convergence  of  the  next  corrector  sequence.  Den 
Heijer  and  Rheinboldt  (1981)  showed  that  while  a  finite  upper  bound  on 
the  radius  of  convergence  of  the  next  corrector  sequence  cannot  be 
derived  solely  from  Information  based  on  previous  corrector  Iterates  but 
requires  more  global  information,  we  can  still  use  the  accumulated 
Information  from  past  corrector  sequences  to  derive  a  useful  estimate  of 
this  radius.  Deuflhard  (1979)  and  Hackl  et  al.  (1980)  also  give 
steplength  strategies  for  the  case  in  which  Newton's  method  is  used  as 
the  corrector  technique.  These  methods  are  all  based  upon  the  use  of 
information  derived  from  the  local  behavior  of  the  algorithm  to  estimate 
global  constants  associated  with  the  convergence  of  Newton's  Method. 


When  quasi-Newton  —  rather  than  Newton  —  correctors  are  used,  it 
Ls  not  clear  how  applicable  the  above  techniques  are.  Moreover,  these 
:echniques  are  based  on  the  questionable  assumption  that  it  is  desirable 
[or  each  predictor  step  to  be  as  large  as  possible  while  still  maintain¬ 
ing  corrector  convergence.  They  do  not  take  into  consideration  the 
possibility  that  such  a  strategy  might  actually  lead  to  an  increase  in 
the  total  work  involved  in  traversing  C  by  increasing  the  number  of 
corrector  steps  needed  for  convergence.  In  our  Implementation  we  shall 
turn  to  the  more  heuristic  approach  of  Georg  (1981). 

Assume  that  the  predictor  is  an  Adams-Bashforth  extrapolation  based 
on  the  last  m  points.  Then 

lzk+1  -  yk+1l  -  0(a£)  where  p  -  nH-2,  -  sk+1  -  sfe  .  (5.1) 

Now  if  the  Jacobian  is  accurate  —  that  is,  calculated  either 
analytically  or  by  finite  differences  —  at  the  start  of  the  corrector 
step,  then  we  can  show  the  following  (Georg  (1981)): 

^k+l  ^  !h(w* ) S  *  ,  where  w^  is  as  defined  in  Section  (1)(5.2) 
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(5.5) 


Vi  5  "V  ■ 

ak+1  =  arcos(vk,  v1^1)  -  CKA^)  .  (5.6) 


By  monitoring  the  variables  defined  in  (5. 2-5. 6)  a  simple  and  effective 
heuristic  is  developed  as  follows.  The  user  of  the  algorithm  decides,  a 
priori,  what  would  be  ideal  values  for  each  of  these  variables  and  then 
chooses  the  steplength,  A,  to  make  it  likely  that  these  values  are 
attained.  For  example,  we  have  from  (5.2) 


JL_  ~  \  ]P 

k+1  Vi  • 


(5.7) 


Therefore,  in  order  to  obtain  an  ideal  value 
step  we  should  set 


Kideal 


on  the  (k+l)8t 


A 


k+1 


/ideal //p  . 


(5.8) 


Similarly  other  values  for  the  steplength  can  be  estimated  using  (5.3- 
5.6).  We  then  take  A,  .  to  be  the  minimum  of  all  these  estimates. 


6.  Estimation  of  Tangent  Directions 

The  use  of  Adams-Bashforth  predictors  requires  that  at  the  end  of 

k 

each  predictor-corrector  cycle  we  estimate  the  tangent  direction,  v  ,  at 
the  recently  located  point,  y  ,  of  C.  The  safest,  but  somewhat 
expensive,  technique  is  to  calculate  it  precisely  by  solving  for  the 
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S  =  the  orthogonal  projection  operator  Into  the  subspace  A 
=  span^,  e2>  ,  e^} 

\  =  8Pan{ek’  ek+l . en} 

X(A)  =  (l  <.  i  <  n  :  there  exists  a  v  e  A  such  that  4  0} 

for  any  subspace  A  c  Rn  (2.1.2) 

2.2.  Update  I 

Consider  the  updating  problem  encountered  at  each  step.  Ignoring 
superfluous  subscripts  In  this  section,  we  may  state  it  as  follows. 

There  is  some  current  approximation,  A,  to  the  Jacobian  matrix.  It  is 
available  in  the  form  of  the  LU  factors  where  A  =  LU.  The  last 
Newton  step  has  provided  new  information  which  we  will  use  to  update  the 
current  approximate  Jacobian.  We  wish  to  update  (L,U)  directly  to 
(L,  U)  where,  for  s,y  e  Rn,  we  require 

LUs  -  y  where  y  =  F(x+s)  -  F(x) 

L  Is  a  unit  lower  triangular  matrix  (2.2.1) 

U  is  an  upper  triangular  matrix  . 

The  system  (2.2.1)  may  be  rewritten  as  follows: 


tt * 


:ep.  In  Section  2  a  new  update  Is  described,  which  may  be  viewed  as  a 
merallzation  of  this  update  and  which  allows  both  L  and  U  to 
try.  Johnson  and  Austria  (1983)  presented  an  algorithm  for  full 
itrices  which  maintains  the  factors  L  1  and  U  explicitly  and 
idates  these  factors  directly  at  each  iteration.  In  Section  2  a  sparse 
irlant  of  this  update  is  presented.  In  Section  3  some  sparsity  results 
slating  the  sparsity  pattern  of  A  to  the  sparsity  patterns  of  L, 

*  and  U  are  examined.  In  Sections  4  and  5  local  convergence 
nalyses  of  the  two  updates  are  presented.  Section  6  concludes  with  a 
amparlson  of  the  two  updates  and  suggestions  are  made  for  overcoming 
heir  present  disadvantages. 


.  Updating  Techniques 
.1.  Rotation 

The  following  sparsity  notation,  which  generalizes  the  notation 
sed  in  Section  1,  is  convenient  and  will  be  used  throughout  the  rest  of 
he  chapter.  For  any  matrix-valued  function  B  :  Q  c  Kn  Hnxn 
ef  ine 

ZB  =  {v  e  Rn  :  ejV  -  0  for  all  j  such  that  e^B(x)ej  ■  0 

for  all  x  c  fi} 

zj  =  zj  n  (v  €  Rn  :  vA  -  0}  (2.1.1) 


zB  . 

{M 

£  RnXn 

T 

:  M  ei 

,  ZB 
e  Zi 

for 

i  ”  1,  2,  ...,  n} 

in 

{M 

c  KnXn 

T 

:  M  ei 

<zB 

for 

i  *  1,  2,  *..,  n }  . 
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suffers  disadvantages  not  associated  with  its  full  lower  dimensional 

3 

counterpart.  For  the  full  Broyden  update  the  0(n  )  operations 

involved  in  solving  for  the  Newton  step,  s  ,  at  each  stage  of  the  itera- 

2 

tion  (1.1.1)  can  be  reduced  to  0(n  )  operations  by  using  the  tech¬ 
niques  of  Gill  et  al.  (1974).  These  techniques  make  use  of  the  fact 

K.“Hl  \a 

that  for  the  full  Broyden  update  (A  -  A  )  is  a  rank  one  matrix. 

lc 

The  QR  factors  of  A  are  maintained  explicitly  and  are  updated  to  give 

k+1  k 

the  QR  factors  of  A  .  The  Jacobian  approximation  A  is  never 

actually  explicitly  represented  in  storage.  For  the  sparse  Broyden 

k+1  k 

update,  however,  (A  -  A  )  is  generally  of  rank  n  and  the 

techniques  of  Gill  et  al.  (1974)  are  not  applicable.  A  must  be 

k+1 

explicitly  maintained  so  that  it  can  be  updated  to  A  according  to 
(1.1.6).  Solution  of  (1.1.1)  for  the  Newton  step  s  ,  therefore,  incurs 
both  the  need  to  refactorize  the  new  matrix  A  at  each  step  and  the 
need  for  extra  storage  to  hold  the  factors  of  A  .  If  these  storage 
costs  and  the  cost  of  factorization,  relative  to  the  cost  of  function 
evaluations,  are  high  enough  then  other  updating  techniques  for  sparse 
problems  may  become  competitive  even  when  they  may  have  slower  conver¬ 
gence  rates  than  the  sparse  Broyden  method. 

It  is  assumed  for  the  rest  of  this  chapter  that  the  storage  require- 
ments  for  the  problem  allow  for  the  solution  of  the  Newton  step,  s  ,  in 
(1.1.1)  by  factorization  techniques.  We  focus  on  methods  which  maintain 
sparse  LU  factors  of  A  which  are  directly  updated  by  incorporating 
the  quasi-Newton  information  at  each  step.  Dennis  and  Marwil  (1982) 
Introduced  an  update  which  holds  L  fixed  and  updates  U  only  at  each 
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0 


Z^  ■  {v  €  Hn  :  ejv  -  0  for  all  j  such  e*  F'(x) 

for  all  x  €  Q} 


J 


and 


Z  -  {M  e  RnXn:  MTe1  £  Z±  for  i  -  1,  2 . n}  (1.1.4) 

Z^  represents  the  sparsity  pattern  of  the  i*"*1  row  of  F'(x),  while  Z 

k+1 

represents  the  sparsity  pattern  of  F'(x).  A  is  chosen  to  be  the 
solution  of  the  optimization  problem 


min{ IA-Akl  :  A  £  Q(yk,sk)  n  Z}  . 


(1.1.5) 


Let  SQ  be  the  orthogonal  projection  operator  into  the  subspace 
A  c  Rn.  Then  the  solution,  Ak+1,  of  (1.1.5)  may  be  written  explicitly 


as 


Ak+1  -  Ak  +  l  [<s  W  (S  l'k)  f  ef(,k- Aksk)  el(S  W 


i-1 


where 


for  a  ■  0 
a  i1  0 


(1.1.6) 


In  Broyden,  Dennis  and  More  (1973),  local  superlinear  convergence 
of  Broyden's  updating  technique  was  demonstrated;  later,  Marwil  (1978) 
showed  that  the  same  is  true  for  the  sparse  Broyden  technique.  In  an 
actual  computer  implementation,  however,  the  sparse  Broyden  method 


Schnabel  (1979))  —  adaptively  define  from  *  as  the  Iteration 

]j£ 

proceeds.  For  example,  Broyden's  method  (Broyden  (1965))  derives  A 

for  k  >  0  through  the  following  updating  technique.  After  completion 

k+1 

of  the  steps  in  (1.1.1),  A  is  obtained  as 

.k+1  .k  ,  ,  k  .k  kN,  k.T,,  k  k. 

A  =*  A  +  (y  -  A  s  )(s  )  /(s  *s  ) 

where  y^  *  F(x^+^)  -  F(x'lC)  •  (1.1.2) 

k+1 

A  is  chosen  to  solve  the  optimization  problem 
min  { II A-A^ II  :  A  e  Q(y^,s^)} 

where  Q(y^,s^)  “{Me  RnXn  :  Ms^  =  y^} 

and  B  •  II  represents  the  Frobenius  norm  .  (1.1.3) 

k+1  k 

A  is  thus  derived  from  A  by  incorporating  the  new  slope  informa- 

k  k+1  k+1  k  k 

tion  obtained  in  moving  from  x  to  x  .  "A  e  Q(y  ,s  )"  is  known 
as  the  quasi-Newton  condition. 

For  large  sparse  problems  the  update  (1.1.2)  is  inappropriate 
since,  in  general,  it  results  in  an  updated  Jacobian  approximation  which 
is  dense.  The  sparse  Broyden  update  (Broyden  (1971),  Schubert  (1970)) 
avoids  this  drawback  by  imposing  a  further  condition  on  the  optimization 
problem  (1.1.3)  so  that  sparsity  is  maintained.  Using  the  notation  of 
Dennis  and  Marwll  (1982),  one  obtains  the  sparse  Broyden  update  as 


follows: 


CHAPTER  3 


DIRECT  SECANT  UPDATES  OP  SPARSE  MATRIX  FACTORS 

1.  Introduction 

Consider  the  general  Newton-type  iterative  technique  used  to  solve 

for  a  zero  of  a  non-linear  system  of  equations.  Given  a  function 

F  :  Q  <=  Rn  -*•  Hn,  we  wish  to  locate  x*  e  Q  such  that  F(x*)  *  0.  In 

this  chapter  we  shall  be  concerned  mainly  with  the  local  convergence 

problem,  l.e.,  given  x^  and  as  initial  estimates  of  x*  and 

F'(x^),  respectively,  we  seek  to  iteratively  refine  this  estimate,  x^, 

until  it  is  within  some  prescribed  distance  e  >  0  from  x*.  The 
8 1 

(k+l)  step  of  the  iteration  takes  the  following  general  form: 

.  .k  k  k.  .k  _nxn 

solve  As  ■  -F(x  )  ,  A  e  R 

j  ..  k+1  k  ,  ,k  k  , ,  ,  , . 

and  set  x  ■  x  +  X  s  .  (1.1.1) 

lc  k 

A  is  chosen  as  some  approximation  to  the  Jacobian  F'(x  ).  Dennis  and 

More  (1974)  demonstrated  that,  in  algorithms  of  this  type,  local  super- 

Jc  lc 

linear  convergence  of  x  to  x*  requires  lim^^  X  *  1.  Hence  for 

lc  k 

x  close  to  x*  we  may  wish  to  take  X  •  1.  For  the  remainder  of 

]£ 

this  chapter,  we  shall  set  X  ■  1  for  all  k  >  0. 

There  are  many  variations  on  the  specification  of  A  .  Setting 
k  0 

A  ■  A  for  k  >  0  results  in  the  pseudo-Newton  method,  while 
k  k 

A  ■  F'(x  )  gives  Newton's  method.  Quasi-Newton  methods  —  also  known 
as  least-change  secant  methods  (Dennis  and  More  (1977)),  Dennis  and 


"a 


8.  Quaai-Bevton  Correctors 

Most  of  the  work  Involved  In  a  predictor-corrector  algorithm  Is 
Incurred  In  the  corrector  phase  of  the  algorithm.  As  a  corrector, 
Newton's  method  is  very  reliable  but  also  very  expensive.  In  our 
algorithm  we  attempt  to  reduce  this  expense  by  using  quasi-Newton 
correctors  instead.  Quasi-Newton  techniques  for  large  sparse  systems  do 
not,  in  general,  have  a  very  strong  reputation  (Thapa  (1981)).  However, 
there  are  certain  differences  between  the  corrector  convergence  problem 
and  the  general  root-finding  convergence  problem  which  suggest  that 
there  are  advantages  to  be  gained  from  the  use  of  quasi-Newton 
techniques,  rather  than  the  more  robust  and  expensive  Newton's  method, 
in  predictor-corrector  algorithms. 

First,  the  level  of  difficulty  of  each  corrector  problem  is  an  open 
choice;  it  can  be  varied  by  choosing  different  predictor  steplengths. 
Hence  less  robust  convergence  techniques  are  feasible.  Secondly,  it  may 
actually  be  better  for  the  algorithm  to  take  two  short  predictor  steps 
and  use  a  less  expensive  corrector  technique  than  to  take  one  large  step 
and  then  use  a  powerful  and  expensive  corrector  technique  to  solve  the 
resulting  difficult  corrector  problem.  Not  only  is  it  possible  that 
less  total  work  may  be  required,  but  also  from  the  point  of  view  of  safe 
path  following  the  second  strategy  has  the  disadvantage  that  it  is  more 
likely  to  lead  to  corrector  divergence  or,  even  worse,  to  convergence  to 
a  point  on  another  curve. 

In  the  next  chapter  we  examine  the  theoretical  aspects  of  quasi- 
Newton  methods  for  large  sparse  systems. 
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Figure  2.7.1 
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Hence  if  M  is  large,  dyn+j/d8  is  very  small  and  a  near-tangential 
approach  is  observed.  To  distinguish  between  this  case  and  the  case 
involving  singularity  of  f'(x),  we  calculate  the  condition  number  of 
f'(x^).  A  Newton-type  iteration  at  level  t  ■  1  is  implemented  only 
if  this  condition  number  er[f'(x^)}  is  less  than  a  for  some 
prescribed  a. 

Whatever  the  reason  for  tangential  or  near-tangential  approach,  it 
is  clear  that  if  C  runs  close  to  the  hyperplane  P  “  {(x,t)  :  t  ■  1} 
over  a  significant  distance,  then  it  is  necessary  to  trace  C  very 
carefully  in  this  region  if  we  want  to  obtain  a  good  estimate  of  the 
point  of  intersection  x  ■  C  n  p.  For  those  cases  in  which 
o[f'(x^)]  >  o,  we  refine  the  estimate  x^  by  restarting  with 
homotopy  (7.4)  and  then  carefully  tracing  C,  rather  than  using  an 
iterative  technique  at  level  t  ■  l. 

Another  problem  caused  by  tangential  approach  is  the  situation 
illustrated  in  Figure  (2.7.1).  Here  C  touches  P  tangentially  at 
x,  so  that  for  the  points  (x^,  t^),  (x^*1,  t^+*)  we  have  t^  <  1, 
t^+*  <1.  To  recognize  such  an  occurrence  in  practice  we  check  for 
changes  in  the  sign  of  dt/ds  for  t  close  to  1.  When  this  situation 
arises,  x  is  located  by  carefully  retracing  C  between  (x^ ,  t^) 
and  (x^+*,  t^+*)  using  smaller  predictor  steps. 


H(x,t)  -  f(x)  -  (1-t)  f(x°) 


(7.4) 


If  f'(x)  is  singular  then  this  strategy  is  not  appropriate  since  it 
is  quite  likely  that  the  final  Iteration  at  level  t  ■  1  will  diverge 
again.  Fortunately,  this  situation  can  be  recognized  by  the  algorithm 
in  practice.  If  f'(x)  is  singular  then  by  (4.2)  the  path  C,  obtained 
from  either  homotopy  (7.1)  or  (7.2),  approaches  the  hyperplane  {(x,t)  : 
t  *  1}  tangentially.  Tangential — or  near  tangential -approach  may 
arise,  though,  for  other  reasons.  For  example,  if  homotopy  (7.2)  is 
used  and  x^  is  very  far  away  from  x  so  that  [f^(x^)|  >  M  for 
1  <  i  n  and  M  >  0  very  large,  then 

^  -  lu(a)»  where  u(s)  "  C~tf ^^(s))]-1  f(x°)  :  l]  . 

But 

l[f,(x(s))]‘1  f(x)°« 

>  Iiru)]-1  f(x°)i  -  lUf'(x)]'1  -  [f*(x(s))]~1}  f(x°) i 

>  kjM-e  for  constants  k^  >  0,  e  >  0 

where  e  is  very  small  for  x(s)  close  to  x  , 

_>  kM  for  some  constant  k. 

Hence 

dyn+l  1  _ 1 _  ,  j_ 

ds  ,u(8)l  “  |{f'(x(s))]"1  f(x°)l “  “ 
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il)  As  the  hyperplane  {(x,t)  :  t  ■  1}  is  approached,  tighten  up 
the  error  tolerances  of  the  corrector  sequences  and  the 
predictor  steplength  control  heuristics  so  that  the  curve  Is 
traced  more  closely. 

i  11 

111)  Continue  the  algorithm  until  some  point  y  ■  (x  ,  t  )  is 

located,  where  t*  1.  Now  if  y(s)  =  (x(s),  (t(s))  ■  Q(s) 

is  the  most  recent  polynomial  predictor  which  was  used  to 
£ 

locate  z  (where  a  m  arclength  measured  along  C) ,  then  we 
solve  for  x(s)  where 

(xU),  1)  -  Q(8)  (7.3) 

If  Q  is  a  low-order  polynomial  we  can  solve  (7.3)  explicitly  for 

x(s);  otherwise,  we  obtain  an  estimate  of  x(s)  by  interpolation 
1  1-1 

between  y  and  y  and  then  use  the  one-dimensional  Newton's  method 

to  solve  the  polynomial  system  (7.3)  for  x(s). 

iv)  Now  taking  x^  -  x(s)  as  an  initial  estimate  of  x  where 
f(x)  -  0,  we  use  Newton's  method  or  a  quasi-Newton  method  to 
solve  f(x)  -  0. 

Divergence  of  the  final  Newton-type  iteration  at  level  t  ■  1,  in 
step  (iv)  above,  results  in  failure  of  th*s  basic  technique.  Such 
divergence  means  that  the  path  following  algorithm  did  not  serve  its 
purpose,  which  was  to  obtain  an  estimate  of  the  solution,  x,  of 
f(x)  -  0  which  is  within  the  domain  of  convergence  of  the  final 
iterative  technique.  One  possible  strategy,  in  this  situation.  Is  to 
restart  the  entire  path  following  algorithm  at  (x^,  0)  using  a  new 
homotopy,  e.g. , 
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For  updating  techniques  which  do  not  allow  good  approximations  to 
the  tangent  directions  by  use  of  this  method,  we  switch  to  either  direct 
calculation  of  v  or  the  use  of  a  simpler  predictor  which  does  not 
require  knowledge  of  the  tangent  directions. 

7.  Terminating  the  Predictor-Corrector  Algorithm 

The  task  of  tracing  C  may  be  carried  out  with  one  of  two 
purposes  in  mind:  either  all  of  C  is  of  interest  or  only  some 
specific  point  or  region  of  C  is.  If  the  first  reason  is  true,  then 
the  algorithm  will  be  relatively  conservative  with  short  predictor  steps 
and  tight  error  tolerances  on  the  corrector  steps  so  as  to  allow  for 
close  curve  following.  If,  on  the  other  hand,  the  algorithm  is  being 
used  only  to  get  to  some  endpoint  then  it  attempts  to  follow  C  as 

loosely  as  possible  without  losing  the  curve  until  the  region  of 

interest  is  encountered. 

The  homotopy  problem  is  an  example  of  the  case  in  which  we  are 
interested  only  in  a  specific  point  of  C.  For  example,  if 

H(x,t)  *  tf(x)  +  (1-t)  A(x-x^)  ,  A  €  RnXn  (7.1) 

H(x,t)  -  f(x)  -  (1-t)  f(x°)  (7.2) 

then  we  wish  to  locate  y  ■  (x,  t)  e  C  where  t  ■  1.  We  then 

have  x  as  a  solution  of  f(x)  -  0. 

The  basic  technique  for  locating  y  is  as  r  ova: 
i)  Begin  the  predictor-corrector  algorithm  at  (x^,0)  moving 
along  C  is  the  direction  of  increasing  t. 


kernel  of  H'(yK),  where  H’(jr  )  Is  calculated  analytically  or  by 
finite  differences.  In  the  case  of  quasi-Newton  correctors,  a  less 
expensive  technique  which  works  well  in  some  cases  (as  discussed  in 
Chapter  4)  is  to  update  the  currently  available  approximation  to  the 
Jacobian  so  that  it  is  correct  in  the  two-dimensional  subspace  spanned 
by  v  and  (y  -  z  )  (see  Figure  2.6.1).  Note  that  this  technique 
requires  two  extra  function  evaluations  at  the  end  of  the  corrector 
phase. 


Example: 

If  Broyden's  update  is  being  used,  and  the  current  approximation  to 
the  Jacobian  is  B,  then  let 


T  T 

8  8 

5  -  B  +  (yx  -  Bsp  +  (y,  -  Bs,)  2 


Vl 


2  T 


S2®2 


where 


*  k-1 
Sj  -  6v 


for  some  6  >  0 


2  ,,  k  k* 

s  -  6(y  -z  )  , 


for  some  6  >  0 


yx  *  H(yk  +  8^  -  H(yk)  , 
y2  -  H(yk  +  s2)  -  H(yk)  . 

Note  that  8j  *s2  ■  0  since  all  corrector  steps  are  perpendicular  to 
1c 

v  .  It  is  easily  seen  that  if  6  is  small  enough,  the  matrix  B  is 


effectively  correct  in  the  subspace  spanned  by  {s.,s7} 


Using  (2. 2. 2-2. 2. 4) ,  a  least  change  secant  update  is  used  to  determine 
the  rows  of  L,  U  sequentially.  Under  the  assumption  that  rows 
k  ■  1,  2,  . i-1  of  L,  U  have  already  been  determined,  then 
whose  components  are  the  same  as  those  of  the  iC^  rows  of  L, 

U,  is  chosen  as  the  solution  of 


min  l!$.  —  4> .  B  (2.2.5(a)) 

♦i 

subject  to 

♦i  w(i)  -  y±  (2.2.5(b)) 

e  Spanfz^,  Z^}  .  (2.2.5(c)) 


Condition  (2.2.5(c))  maintains  the  sparsity  patterns  of  L,U  while 
(2.2.5(b))  is  the  quasi-Newton  condition  for  the  ittl  row.  Note  that 
the  update  of  Dennis  and  Marwil  (1982)  is  obtained  if  we  impose  on  the 
optimization  problem  (2.2.5)  the  further  condition 


(2.2.6) 


which  holds  L  fixed.  The  next  theorem  gives  the  solution  of  (2.2.5); 
the  proof  is  deferred  until  Section  3. 


Theorem  2.1: 

Let  =  Span{Z^,  Z^}.  The  solution  of  (2.2.5)  may  be  written 


explicitly  as 


^^[(sVV  (S 


(y ±-*i  w(i))  (s  i(w(i))T 


(2.2.7) 


where  is  determined  iteratively,  according  to  (2.2.3),  by 


8 


n 


I 

<r-i-l 


U,  ,  s 
i-l,o  a 


if  j  -  i-1 


a) 


(i-1) 


otherwise 

for  i  ■  2,  3,  . . . ,  n  . 


(2.2.8) 

□ 


The  convergence  analysis  of  this  update  —  presented  in  Section  4 
—  seems  to  suggest  that  it  should  be  implemented  in  a  cautious  type  of 
algorithm  which  allows  for  periodic  restarts,  similar  to  that  used  by 
Dennis  and  Marwll  (1982). 

2.3.  Algorithm  I: 

(1)  Choose 

x°  €  *n  as  an  approximation  to  x* 


m  a  fixed  positive  Integer 


(2)  Evaluate  F(x^)  and  A^,  a  finite  difference  (or  analytic) 


approximation  to  F'(x^). 


(3)  Factorize  **  lP\P  using  a  threshold  pivoting 


strategy,  where  P°,  Q°  are  permuation  matrices. 


(4)  Solve 


Tk  k  o0  ,  k, 

L  w  *  -P  F(x  )  , 

k  k  k 
U  t  =  w  , 

k  „0.k 
8  ■  Q  t 


....  k+1  k  .  k 

(5)  x  ■  x  +  s 

k+1 

Evaluate  F(xK  ; 


If  stopping  criteria  are  met,  then  STOP. 


(6)  If  k  »  m-1,  set  x°  »  x^+*,  k  *  0  and  go  to  (2). 

k  k+1  k 

(7)  Set  y  »  F(x  )  -  F(x  )  and  update  the  rows  of  L,U 
according  to  (2.2.7). 


(8)  k  +  k  +  1;  go  to  (4). 


For  simplicity,  step  (4)  is  stated  under  the  assumption  that 

is  non-singular.  If  is  singular  after  updating,  the  entire 

0  k 

algorithm  is  restarted  with  x  as  the  current  value,  x  .  The 


threshold  pivoting  strategy  of  step  (3)  is  used  to  enhance  the  sparsity 


of  and  U^  by  relaxing  the  usual  stability  test;  for  further 

details  see  Duff  (1977).  As  with  the  update  presented  in  Dennis  and 
Marwil  (1982),  periodic  recalculation  of  the  Jacobian  matrix  seems  to  be 
necessary  to  ensure  convergence. 

2.4.  Update  II 

The  somewhat  unsatisfactory  theoretical  requirement  of  Update  I, 
that  periodic  restarts  are  necessary  after  some  fixed  number  of  steps, 
may  be  avoided  if  we  work  explicitly  with  L  *  instead  of  L.  We 
employ  a  sparse  version  of  the  update  in  Johnson  and  Austria  (1983). 
Denoting  N  -  L-1,  i.e. ,  NA  -  U,  where  N  is  a  unit  lower  triangular 
matrix  and  U  is  upper  triangular,  we  consider  the  problem  of  directly 
updating  (N,U)  to  (N,U)  such  that 

Ny  *  Us  .  (2.4.1) 

Condition  (2.4.1)  is  the  quasi-Newton  condition,  equivalent  to 
As  *  y.  This  problem  is  treated  in  Johnson  and  Austria  (1983)  where  it 
is  assumed  that  N,U  are  dense  triangular  matrices.  For  the  sparse 
case  we  choose  (N,U)  to  be  the  solution  of  the  following 
optimization  problem: 


min JN-N+U-UI 
subject  to 

Ny  -  Us 

—  N  —  TT 

N  €  Z  ,  U  e  Z 

*  1  for  i  ■  1,  2,  n  (2.4.2) 

The  next  theorem  gives  the  solution  of  (2.4.2);  the  proof  is  deferred 
until  Section  3. 


Theorem  2.4 

The  solution  of  (2.4.2)  is  given  by 


where 

v<« 


T1 

*1 

Xi 


\  -  vi  -  aJ(S  1  v(i))T  (S  1  v(1))  f  (S  1 


for  i  -  1, 

2,  . 

(  ~8 

'  <>-1 

*  3^2  * 

•••»  * 

•  • »  “ •  •  • 

•  '8n 

(UH. 

U12* 

uln) 

(S„, 

^12* 

...»  oln) 

<8ir 

•••  » 

Ni,i-1*  Uii’ 

...,  uln) 

for 

(5U. 

•  •  •  » 

*i,i-l»  ^ii* 

uln) 

for 

(i).T 
v  ) 


if  i  -  1 

if  i  >  1 


i  -  2,  3, 
1  -  2,  3, 


(2.4.3) 


(2.4.4) 


•  i«)  n 


•  •  • )  n 
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a  ■  Ny  -  Us 


£ 


\  -  Span{zJ,  zj} 


2.5.  Algorithm  II 


This  is  similar  Co  Algorithm  I,  except  we  may  take  m  ■  ®. 


(1)  Choose  x^  €  Fn  as  an  approximation  to  x*. 


k  «-  0. 


(2)  Evaluate  F(x^)  and  fP ,  a  finite  difference  (or  analytic) 


approximation  to  F'(x  ). 


(3)  Factorize  A°  as  N°P°A°  Q°  -  U°  using  a  threshold  pivoting 


0  0 

strategy,  where  P  and  Q  are  permutation  matrices. 


(4)  Solve  L^t^  -  -NkP°F(xk), 


k  Ok 
s  -  Q  t  . 


/c.  k+1  k  .  k  k+1.. 

(5)  x  -  x  +  s  ;  evaluate  F(x  ). 


If  stopping  criteria  are  met,  then  STOP. 


k  0  k+1  k 

(6)  Let  y  ■  Pu(F(xK  ;  -  F(x*))  and  update  the  rows  of  N  and 


U  according  to  Theorem  2.4. 


(7)  k  +  k+1;  go  to  (4). 


2.6.  Pivoting  Considerations 


We  first  note  that  Updates  I  and  II  are  stated  under  the  assump- 


0  0  0  0  0 

tlon  that  the  pivoting  strategy  is  the  identity,  i.e.,  P  A  Q  ■  L  U 


where  P^  *  *  I.  It  should  be  clear  that  if  P^  and  are 


nontrivial,  then  we  can  permute  the  Independent  and  dependent  variables 


(St 


so  that  we  obtain  a  system  with  the  trivial  pivoting  strategy,  I,  and  to 
which  we  can  then  apply  the  Updates  I  and  II  as  stated  above.  This  will 
not  be  done  explicitly  here  since  it  merely  amounts  to  an  exercise  in 
notation. 

Updates  I  and  II  attempt  to  go  directly  from  an  approximation 
(L^U*)  —  or  (N^U*)  —  of  the  factors  of  F'(x^)  to  an  approximation 
(L*c+^,Ulc+1)  —  or  (N^c+\u*c+^)  —  of  the  factors  of  F'Cx^**).  In 

essence  this  requires  assumptions  on  the  continuity  of  the  factors  of 
F'(x)  and  on  the  persistence  of  the  pivoting  strategy  as  we  move  from 
one  matrix  approximation  to  another.  The  following  two  theorems 
establish  the  validity  of  this  process. 

Theorem  2.6.1 

Let  A  :  Rn  ♦  RnXn,  and  suppose  that  for  some  x°  e  Kn,  A(x°) 
is  nonsingular,  and  that  there  exist  eQ  >  0  and  y^  >  0  such  that 

|e*[A(x)  -  A(y)]ej|  £  y^lx-yl  , 

for  i,j  -  1,  2,  ...,  n  and  for  all  x,y  t  M(x^,Cq)  ■  {z  €  Fn  :  lz-x®ll 
<  Eq>.  If  the  LU  decomposition  without  pivoting  exists  at  x* , 

A(x^)  ■  L(x^)U(x^),  then  there  exists  e  >  0  such  that  decomposition 
without  pivoting  exists  at  all  x  e  H(x^,e).  Furthermore,  there  exist 
constants  Cg.dp  *  0  such  that 

IL(x)  -  L(x°) I  <  c0«x-x°l  and  IU(x)  -  U(x°)  I  <  dglx-x0! 
for  all  x  €  ■<x^,e).  0 
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Proof:  See  Dennis  and  Marwil  (1982) 


Theorem  2.6.2 

Let  A  :  Fn  -►  Rnxn  be  continuous  and  nonsingular  at  x*. 

For  any  threshold  pivoting  strategy,  there  exists  ry^,  >  0  such  that  if 
£  M(x*,  TJ.J,)  and  if  (P^,Q^)  is  a  pivot  sequence  for  which 
P^A(x^)Q^  has  an  LU  decomposition  without  further  pivoting,  P^A(x^)Q^ 
■  L(x^)  U(x^) ,  then  P^A(x*)Q^  can  be  factored  without  further 
pivoting.  0 

Proof:  Similar  to  Dennis  and  Marwil  (1982),  Corollary  3.5. 

Note  that  by  the  Banach  Perturbation  Lemma  (stated  below),  which 
establishes  the  continuity  of  L  *  as  a  function  of  L,  we  obtain  as  a 
corollary  of  Theorem  2.6.1  the  following 

Corollary  2.6.3 

With  the  same  hypotheses  as  Theoem  2.6.1,  there  exists  c^  >  0 
such  that  IN(x)  -  N(x°) I  -  HL-1(x)  -  L_1(x°)  I  <  CqIx-x0!  for  all 
x  €  l(x°,e).  □ 

Theorem  2.6.4  (Banach  Perturbation  Lemma  115]) 

Let  A,C  c  HnXn  and  assume  A  is  Invertible  with  IA  *  I  <  a. 

If  IA-CI  <  p  and  ap  <  1,  then  C  is  also  Invertible  and 


’A*  « 


v;-v] 
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3 


1C-1 1  <  <x/(l-ap)  .  □ 

Some  Sparsity  Relations hips 
3.1.  Non-cancellation  Assumption 

The  updates  in  Section  2  are  defined  in  terms  of  the  sparsity 
patterns  of  L,N  and  U.  The  sparsity  patterns  of  these  factors  are 
generally  not  given  as  input  to  the  algorithm  but  instead  are  defined 
during  the  factorization  process.  The  sparsity  pattern  of  F'(x)  and 
is,  however  input  which  is  given  to  the  algorithms  of  Section  2;  it 
may  be  specified  by  the  user  or  predetermined  by  some  separate  sub¬ 
routine.  In  this  section,  some  results  concerning  the  sparsity  patterns 
of  the  factors  are  established.  But  first,  we  need  an  assumption  on  the 
way  these  patterns  are  defined  during  the  factorization  process. 

Consider  the  reduction  of  a  matrix  A  to  upper  triangular  form  by 
Gaussian  elimination.  Without  loss  of  generality,  assume  here  that 
P  ■  I  whre  P  is  the  permutation  matrix  representing  the  pivoting 
strategy,  i.e.,  A  ■  LU,  where  L  is  unit  lower  triangular  and  U  is 
upper  triangular  with  non-zero  diagonal  elements.  The  process  takes 
place  in  n  steps  as  follows: 


A(1)  -  rQA  where  rQ  -  I 

(1+1)  -  rtA(1)  for  i  -  1,  2,  ...,  n-1 

(3.1.1) 

where 
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and 


for  j  <  i 


i  T 

Fi  "  1  +  *  ei 


c1  £  *n,  cj  -  0 


-  U 


(3.1.2) 


Thus 


N  •  r.-l  r„-2  -  r0  ■  li  (I  +  !“'1  Vl» 


—  i  —i  n— 1  .  _  n— 1  ,  _ 

l  -  r/  r,1  r  -  n  (i-c1  e)»i-  l  r  e:  .  (3.1.3) 

i-l  i-l  1 


8 t 

At  the  (1+1)  stage  of  the  reduction,  the  elementary  matrix 
premultiplies  to  reduce  to  zero  all  elements  of  the  i1"*1  column 

of  A^*\  which  are  In  rows  r  for  r  >  1.  However,  unexpected 
zeroes  may  arise  through  cancellation  In  other  columns  in  positions 
where,  formerly,  there  were  non-zero  elements.  The  non-cancellation 
assumption  says  that,  in  defining  the  sparsity  pattern  of  A^i+*^  at 
the  (i+1)  stage,  accidental  zeroes  which  arise  are  treated  as  small 
non-zero  elements.  No  advantage  is  taken  of  unexpected  zeros  to  reduce 
the  density  of  the  matrix  A^i+^  and  of  subsequent  matrices  A^^ 
for  1  >  i+1.  This  is  equivalent  to  defining  the  sparsity  patterns  of 
L  and  U  as  the  sparsity  pattern  of  LU  factors  of  maximum  density 
that  can  be  generated  from  all  matrices  that  have  the  same  sparsity 
pattern  as  A. 
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m.M  jjjJmdZ*  Jk 


Kt— pie  3.1.1 


The  symbol  6  Is  used  In  the  above  example  to  denote  zero  elements 
which  are  treated  as  non-zeroes  for  the  sake  of  defining  the  sparsity 
patterns  of  the  factors.  The  acceptable  sparsity  pattern  for  U  In  the 
above  example  Is  given  by 
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here  *  denotes  a  non-zero  element. 

The  non-cancellation  assumption  Is  also  extended  to  the  generation 
£  the  sparsity  pattern  of  N.  We  assume  that  no  cancellations  occur  In 
he  generation  of  N  according  to  (3.1.3),  i.e.,  the  sparsity  pattern 
f  N  has  the  maximum  density  that  can  be  obtained  from  all  possible 
actors  having  the  same  sparsity  patterns  as  for  1  *  1,  2,  ..., 

-1. 

x— pie  3.1.2 


1  0 


1  0 


0  0 


n  -  r2rl 


-l  o 


The  sparsity  pattern  of  N  is  taken  to  be 


*  0 


0  0 


0  0 


*  0 


*  0 


0  0 


and  not 


0  0 


Here  was  set  to  zero  by  an  unexpected  cancellation  while 
multiplying  the  factors  1^,  so  we  shall  consider  ^  0  in 
defining  the  sparsity  pattern  of  N. 


3.2.  Sparsity  Results 


\  =  Span{zJ\  zj} 


Xt  =  SpanfzJ,  zj} 


The  following  theorem  establishes  the  basic  relationships  in  the 
sparsity  patterns  which  we  shall  need. 


S'  * 

*  V  ‘.v  V  v  .* 


•'  **  •  ■  *  •  *  % "  •  *  «  *  .  '  ■  •  .  *  •  •* .  •'»  »*,  - "»  »*  i',  .' 


4.2.1 


If  lU^  -  I  26,  then  for  6  small  enough  there  exists  a 

»  >  0  such  that  plSYi(w(i))l  >  «SY±(s) I. 

'roof: 

T  i(w(i))  -  [T  i(U(i))]  [T  i(s)  ]  ,  by  Lemma  3.2.2 

T  1(s)  -  [T  1(U(1)) f1  [T  1(u(1)) ]  .  (4.2.1) 

Now  |U(i)-U*(i) I  <  26  implies  «TYl(U(i))-TYi(U*(i) ) J 

<  26  and  we  know  that  TYi(U*^)  is  nonsingular.  Therefore  by  the 
Banach  Perburbation  Lemma,  for  6  small  enough  and  IU^^-U*^  I 

<  26,  there  exists  >  0  such  that  [TY*(U^)]  *  exists  and 
0  <  .|TYi(U<1))]*1.  <  p  .  Substituting  into  (4.2.1),  we  get 

Y  Y  Y 

IT  A(s)l  <  l[T  1(U(i))f1l  •  IT  i(w(1))l 

Y 

_<  pIT  (wv  J)t  where  p  -  max{p.  )  . 

i 

The  required  result  now  follows  by  noting  that  for  any  v  e  Kn, 

IT^(v)l  ■  ISA(v)l  for  any  subspace  A  c  Hn.  0 

Denote  4*  =  (LJj,  •••>  L*  j.j.  UJ  i»  and  let 

x  *  x  +  8. 
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-*  Zjfi. 


A*. 


for  some  p  >  0  . 


•i  Iv-ul  <  IF(v)  -  F(u) I  <  plv-ul 

tie  following  lemma  gives  a  result  analogous  to  (4.1.1),  but  obtains  a 
Lghter  bound  by  exploiting  the  sparsity  pattern. 

BW  4.1.1 

There  exist  e  >  0  such  that  if  o(u,v)  <  e,  then 

Zi 

|F1(v)  -  F^(u)  -  FJ(x*)(v-u)|  £  <o(u,v)  IS  (v-u)  II  for  some  k  >  0 

roof:  Let  y  ”  F(v)  -  F(u)  and  s  ■  v  -  u. 

|y±  -  F'(x*)s|  -  |(FJ(u+ts)  -  FJ(x*)]s| 

for  some  0  _<  t  _<  1  by  the  mean  value  theorem 

zA 

-  l[F^(u+ts)  -  F|(x*)]  (S  1(s))l  since  ZA  *  zj'(x) 

Zi 

<  tc^lu+ts~x*l  IS  (s)l  by  assumption  (c) 

where  k  *  max{tc.  }  . 
i 

.2  Convergence  Inoolta 

Denote  A*  •  F'(x*)  *  L*  U*.  We  assume,  without  loss  of 
enerallty,  that  the  pivoting  strategy  at  x*  is  the  identity.  The 
ollowlng  lemma  uses  notation  Introduced  in  Lemma  3.2.2. 


n  {v  e  Rn  :  vTu^  -  y^  . 


Proof  of  Theorem  2.4 

We  verify  easily  that  x^  e  and  x ^  “0.  But  these 

imply  the  feasibility  of  L,  U.  Optimality  and  uniqueness  now 
follow  as  in  the  proof  of  Theorem  2.2. 

4.  Convergence  Analysis  of  Algorithm  I 
4.1  Properties  of  Function  F(«) 

Let  I  *  I  denote  the  vector  norm  or  the  Frobenius  matrix 

norm.  Assume  F  satisfies  the  following  conditions: 

(a)  F  :  Rn  -*■  Hn  is  continuously  differentiable  on  an  open 
convex  set  Q  c  . 

(b)  There  exists  x*  e  Q  such  that  F(x*>  -  0  and  F'(x*)  is 
nonsingular. 

(c)  There  exists  k  >  0  such  that  IF'(x)  -  F'(x*)l  <,  k!x-x*1  for 
x  e  Q.  Or,  equivalently,  we  have  that  there  exist  >  0  for 
i  ■  1,  2,  ...,  n  such  that  IF|(x)  -  F^(x)  I  k^Ix-xH  for 
all  x,x  €  Q. 

Denote  a(u,  v)  =  max{lu-x*I,  lv-x*l}.  By  Lemma  3.1  of  Broyden  et 
al.  (1973)  we  then  have  that  there  exists  e  >  0  such  that  if 
cr(u,v)  <  e,  then 

IF(v)  -  F(u)  -  F’(x*)(v-u)l  <  ko(u,v)  Iv-ul  (4.1.1) 

and 
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where  ¥?i(u(1)  is  nonsingular  since  it  is  an  upper  triangular 
matrix  with  non-zero  diagonal  elements.  Therefore  T^Cs)  -  0  which 
implies  s*l(.)  .  0.  Since  c  by  Lemna  3.2.1(a),  we  get 
S  *(s)  -  0.  Now 


yi  "  Fi^*  +  8>  “  F±(x) 


e^  F'(x  +  ts)s 


-  e*  F'(x  +  ts)[S  1(s)] 


for  some  0  _<  t  _<  1  by  mean 
value  theorem 

since  A  and  F'(x)  have  same 
sparsity  pattern 

Zi 

since  S  (s)  ■  0  . 


Bence  «  0  -  y^  and  is  feasible. 

We  verify  the  optimality  of  ^  by  considering  any  other  feasible 
vector 


1*4  - 


♦Al  -  l[(s  1  U(1>)T  (s  1(u(1)))f  (y±  -  U)(i))  (s  1(coCi))  )TH 

Y  Y  Y  Y 

r,  i  (i)  .T  ,  li  (i)  *  ,  i  .  ,  i 

-  I [ ( S  <u>  ))  (S  (u  ))f  (4>1-01)  (S  (u<i)))  (S  (oKi)))1! 


~  (i) 

since  ^  w  ■  y^ 


<  1^  ~  <^1  . 


Uniqueness  of  the  solution  follows  from  the  convexity  of  the 
feasible  region 
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v.v.T'r.'Lvr 


ill  ii  |  jlfi 


T?r'!'1  .  f  L 


since  «  f  0.  But  A^  is  non-singular  since  A  is  LU 
factorizable  without  pivoting*  This  is  a  contradiction.  Therefore 
T^A^)  oust  be  non-singular. 

3.3  Proof  of  Previous  Le— as 

Now  that  the  sparsity  patterns  of  L,  N  and  V  have  been  clearly 
specified,  we  proceed  to  verify  the  assertions  in  Theorems  2.1  and  2.4. 

Proof  of  Theorem  2.1 

We  first  check  feasibility  of  the  stated  solution.  Since  ^  e  Y, 
and  SYi(u(1))  €  Y± ,  then  ^  by  (2.2.7).  If  SYi(u(i))  +  0, 

then  from  (2.2.7) 

\  u(1)  -  ^  u)(i)  +  [(S  1(u(1)))T  (S  1(to(1)))]  (y±  -  w(1)) 

(sYl(u(i,))T  (s*1  J‘>) 

-  +  (y±  -  W(i)) 

’  *i  ‘ 

If  SY*(w^)  «  0,  then  ^  ^  and  ^  ■  0.  This  means 

that  is  feasible  only  if  y^  ■  0.  But  SY*(m^)  *  0  implies 

TYi(u(1))  -  0.  Thus 

0  -  T  i(u(i))  -  T  i(U(i)s)  -  IT  1(U(i))J  [T  i(s) ] 
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I 


•f . 


■  -•A- 


Proof: 


s  4(A(1)S)  -  [s  i(A(1))]« 

-  [sXl(A(1)) ]. 

X  X 

since  S  i(A^1^ )  -  S  1(A^1^)  by  Lemma  3.2.1(b) 

X.  t4\  X4 

-  [S  i(A(1))]  [S  i(s) ]  . 

It  now  follows  trivially  that  TXl(A(i)s)  -  [TXi(A<i)> J 
|TXi<s)l.  Now  let  x(xA)  “  {ij.  •••*  lm}  assume  T*i(A<1)> 

is  singular.  Then  there  exists  awe  ®.m  such  that  w  4  0  and 

VT  -  wT[T  1(A(i))  ]  -  0 

Define  w  e  Hn  by 

,  wf  if  k  e  xUj)  and  k  -  ir 

Wk  "  I 

'  0  otherwise  . 

Let  v  ■  w*A^.  If  8  e  x(X^)»  8  "  then 


and  if  i  {  x(xA).  then  by  Lemma  3.2.1(b),  a£^  -  0  for  all 

m  /l  \ 

r  €  x(*. )  *nd  hence  v  *  0.  Therefore  v  ■  w  A'  *0  where  w  f  0 
A  i  8 


Now  eliminating  the  zero  rows  and  columns  j  for  j  £  ^  we  get 
TYi(u(i)s)  -  [T  i(U(i))]  [T  i(s) ]  . 


3.2.3 


Vi-l,l  *1-1, 1-1 


i“l»n  * 


SXl(A(i)s)  -  [S  1(A(1))]  [S  1(s)]  , 


T  1(A(1)s)  -  [T  1(A(i))]  [T  i(s)]  . 


Moreover  S*i(*(1>)  is  non-singular. 


•-  ■  •  •  >  •  «-V"  v-  -  «  ,  «  C.  ■  .  ■  .  -  .  -  .  -  .  . 


3.2.2 


If 


U(1)  - 


U11  °12 


U 


l,n 


*  *  *  Ui-l,n 


O 


O 


then 


and 


S  V^a)  -  [S  i07(i)>  ]  [S  i(s}  ]  , 


T  i(u(i)s)  -  [T  i(U(1)> ]  [T  i(s)]  . 


Proof: 


I  I 

S  i(u(l)s)  -  [S  i(U(i))]  s  by  definition 

-  [s'W1’)]  . 


Y  Y 

since  S  i(U(i))  -  S  1  U(i)  by  Learn a  3.2.1(a) 

-  [sYl(u(i>)]  [sllc.)]  . 
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.^v.r,w.v: 


•*  _ •  _*  .■  _•  '  k  *  |  .  •  .  w 


the  projection  operator  which  seta  to  zero  those  elements  of  the  matrix 
with  row  index  i  for  i  $  x(A)»  i*e.,  for  M  e  Rn*n, 


[SA(M)  ] 


if  a  ^  x(A) 


ap 


M  „  ,  otherwise 

ap 


Similarly  define  SA  :  RnXn  ♦  RnXn 


[SA(M)  ] 


ap 


if  a  )•  x(A)  or  p  Jr  x(A) 


M 


dp  » 


otherwise 


We  also  define  the  associated  collapsing  operators  TA,  TA.  If 


X(A)  -  {iv  i2,  ...»  iffl} 

where  m  *  rank  A  and  i.<i_  <•••<! 

i  z  m 

then  define  TA  :  Rn  -*•  by 


T  (v)  -  0^  ,  v  ,  v±  )  where  v  •  (Vj,  v2>  ...»  vn> 

12  m 


and  T4  =  Rn*n 


„mxm  , 

R  by 


ta(m) 


M  •  •  •  M 

l  i  i  i 

V  m 


M  •  •  •  M 

i  ,i,  i  »i 

m  l  m  m 
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A(p+1)  _LP 

-  Span{Z^  :  j  €  x(z^  )  u  U,p}} 

.(p+1)  Lp 

-  Span {z *  :  j  e  x(Z±  )}  . 


Therefore  (c)  is  true  for  k  ■  p. 
Similarly 


_  «P  A<P+D 

xj  -  Span {2 J  ,  Z*  } 


-  SpanJX^'0,  X<p-1)} 


a  mP“*  nP  % 

Span{z^  !  j  €  X(z“  )  U  X(Z±  )} 

Span{zJ  ;  j  €  x(zf)} 


by  induction  hypothesis 


which  verifies  (d)  for  k  -  p. 

Now  deduce  by  induction  that  (c)  and  (d)  are  true  for  k  ■  0,  1, 
. . . ,  n-1  and  hence  (a)  and  (b)  are  true«  □ 


We  now  prove  some  sparsity  results  which  will  be  used  later  in  the 


convergence  analyses  of  Sections  4  and  5. 


Notation 


We  already  have  S^  :  Rn  •*  Rn,  for  some  subspace  A,  defined 
as  the  projection  operator  into  the  subspace  A.  This  notation  is 
extended  to  matrices  as  follows:  Let  :  Rn  n  *  B.nXn  be 


Caae  (Hi):  i  >  p  and  e  e 

A<P> 

Since  e  e  Z.  we  have 
P  1 


_T  p  A(p+1), 

Span{Z^  ,  Zj  } 


Tp-1  A(p)  A(p) 

Span {Span (zj  ,  ep},  Span{Z*  ,  Zp  }  n  (x  :  xp  -  0}} 

,  p-1  . (p)  *(p) 

Span { Span f Z±  ,  ep},  Spanlz^  ,  z£  }} 

. P”1  *(p) 

Span {Span (Z^  ,  },  Span{ep,  Zp  }} 

A(p)  tP"1  ,  *(P> , 

Span { Span {ZJ  :  j  e  x(Zt  )}»  z£  } 


by  induction  hypothesis  and  ep  <=  Zp 


A(p)  .rP'1  ,  *(P)  A(P), 

-  Span {Span {zj  ;  j  e  x<Zi  )},  Z^  >  z£  } 

A(P+1)  _rP  ,  f  A<P>  _A(p),, 

■  Span{Span{Zj  :  j  e  x(Z^  )}»  Span{z^  »  Z^  }} 


since  j  e  x(Zj  )  implies  j  <  p 

A(p+1)  _tP  ,a<p+1)  _A(p+1)^ 

Span{Span{Zj  :  j  e  x(ZA  )},  Spanfz^  »  Z p  }} 


>  v-v-i>v- v  . 


j.  •  .  •  a _.v.v.v 


Cue  (11):  1  >  p  and  ep  {  zj 


Since  e* 


we  have 


T. (p+1)  T.(p)  T^p  TMp-l  TTp  T.p-1 

e^A  r  -  e^A  r  ,  e^lr  -  e^lr  ,  e^I/  “  ei^ 


i<p+1)  .  7a(p)  -nP  .  jT1  7l*  .  .iT1 

L  Zi  ’  Z1  zi  zi  Zi 


YP  -  Span{ZLP,  zJ<P> } 

_tP-1  A(p\ 

-  Span {z^  ,  Z*  } 

,_a(p)  tP-1  . 

-  Span{Zy  j  j  e  )}  by  Induction  hypothesis 


a(p)  ,p-l  4(p) 

Span { {Zj  :  j  €  x(Zj[  )}»  Z*  } 

.(p+1)  p  .(p+1) 

Span{ {zy  :  j  «  x(zj  )},  zj  }  , 

-Lp_i  A( 

since  j  e  x(Z^  )  implies  j  <  a  and  Z^ 


k(P~l) 


-  Span{z^ 


(P+1)  LP 

:  j  «  X<Z"  )}  . 


Therefore  (c)  Is  true  for  1c  ■  p.  Similarly  (d)  is  true  for  k  ■  p. 


t 


yJ  -  -  Span{zj  :  j  £  x(zj  ) }  , 
xj  -  z\  -  Span {Zj  :  j  e  x(Z*  )}  . 


Thus  (c)  and  (d)  are  true  for  k  ■  0. 

Assume  (c)  and  (d)  are  true  for  k  ■  0,  1,  p-1  <  n. 

We  shall  consider  three  cases.  Note  that  £p  ■  0  for  a  <  p 
where  Is  defined  In  (3.1.2).  Therefore 


T.(p+1)  T.(p)  T„p  T.  p-1  T_p  Trp-1 

e  A  **  e  A  r  ,  e  N*^  ■  e  N  ,  e  Ir  ■  e  LF  ,  for  a  <  p 

a  a  a  a  a  a  — 


.(P+1)  .  /P)  ,NP  .  ZLP  .  ZLP 

i  a  ’  a  a  *  a  a 


Case  (1) :  1  £  p 


r-LP  _A(P+I)i 

Span{zJ*  ,  zj  } 

,_Lp-l  A<P), 

Span {z^  ,  Z*  } 

r  a<p>  LP_1  . 

Span{zj  :  j  e  xCzJ"  )} 

r  A<p+1)  LP  , 

Span fz*  :  j  e  x(zJ  )} 


for  a  p 


by  Induction  hypothesis 


since  j  e  X(Z^  )  implies  j  < 


Therefore  (c)  is  true  for  k  -  p.  Similarly  (d)  is  true  for  k  -  p. 


■V s.'-\ **  *.•  _V.‘ 


(a)  Y±  -  Span{Zj  :  j  £  X(zp}  , 

(b)  X±  -  Span{Zj  :  j  £  X(zJ)}  . 

Proof:  Let 

0  <  j  <  n-1 
0  <  j  <  n-1 

Then  Y  -  yJ_1  and  X±  -  X^_1. 

We  shall  show  by  Induction  that 

,  .(kH)  .k 

(c)  Yj  -  Span{Zj  :  j  €  X(Z“  )}  ,  k 

and 

(d)  -  Span{zJ  :  j  £  X(Z^}  ,  k 

The  statement  of  the  theorem  Is  (c)  and  (d)  for  k 


N' 


(J  - 


r  r  •  •  •  tv  . 
j  j-i  o  » 


-  r"1  r"1  ...  r"1  , 
i  -lj  a(j+1) 

q  -  SpantzJ  ,  Z*  ] 

1  -NJ  A(J+1) 

q  -  Span{z“  ,  Z*  ] 


Consider  the  case  k  ■  0 


There  exist  e  >  0,  6  >  0  such  that  if  o(x,x)  <  e  and 
IU^  -  I  <  26,  then  there  exists  a  p  >  0  (depending  on  6) 

such  that 


1^  -  <  14  -  4* I2  +  2p2 [( ko)2  +  IIL*^  IUV1;  -  U*'ll'V] 


2  .r,(i)  _  „*(i).2 


for  1  ■  1,  2,  . n.  Here  L*  denotes  the  1  row  of  L*  and 
a  ■  o(x,x). 


Proof:  Let  ■  S^*(c/^).  Then  from  (2.2.6)  we  obtain 


A(i)  A(i)T 

o*  -  (4±  -  ♦})  [i  -  va)  ] +  (y± 


(!)  _A_(i)T 


(i)T  «(i) 


-  *<i>, 

^  }  w(i)T  w(i)  ’ 


(4.2.2) 


1(4.  -  4*)  w(i)I2 


(4.2.3) 


,wi  i  '  *(i)T  *(i)’ 
w  w 


•l<»i  -  Ais)  +  <♦?  «*  •  -  n »  »  -(DT— (»■ 

w  w 


where  A*  is  the  z  row  of  A* 


ly  -  A*s I  IL*(U(i)  -  U*(i))sl 
s  - i - i -  +  _ _ _ 

lwvl'l  Iw^'l 


(4.2.4) 


Now  using  Lemmas  4.1  and  4.2.1  we  get 


-'ll  .-fill  <  ZSiZiSl  .1?  <  KpJ  8lnce  zAcl  (4.2.5) 

i»(1) .  p-'.s1^). 

and  using  Lemma  4.2.1,  we  get 

Y  Y 

JL*(U(1)  -  U*(1))sl  BL* I  IS  1(U(1)  -  U*(1))l  IS  ±(s)l 
lw(1)l  “  p_1ISYi(8)l 


_<  pIL*  I  IU(1)  -  U*(1)  I 


(4.2.6) 


Substituting  (4.2.5)  and  (4.2.6)  into  (4.2.4)  and  then  substituting  this 
result  with  (4.2.3)  into  (4.2.2)  we  get 


;  -*  j*  -•  ‘j.  *  «  V.  * 


‘  -*  V  '.*  '  »  v  V  *  •  •  •  *  -  *  *  •  •  v-  •  *  *  k%  /■ 


2 


_  k  *  .  '  ^  *  . 


2  ,  ;(1)  a(1)T,  2  (i)  i(1)T  2 

i*  -  4>*iz  -  K*  -  **)  (i  -  £ — - — )r  +  «(y  -  <o*  wu;)  — - - r 

1  1  *(i)  -Ci)  *(i)  *(i) 

w  w  w  w 

by  orthogonality  of  vectors  on  right  hand  side  of  (4.2.2) 


<  1^  ~  <t>J)lZ  +  [icpo  +  p!L*l  BU^1;  -  u*(i)  »]2 

<  1^  “  4>*)«2  +  2p2[(<0)2  +  IL*I!2  «U(1)  -  U*(1)l2] 


since  Ix+y  I2  <  2  lx  l2  +  2  ly  I2  .  □ 


The  next  lemma  converts  this  result  Into  a  convenient  form. 


4.2.3. 


There  exist  6  >  0  and  e  >  0  such  that  if  o(x,x)  <  e  and 
|U^  -  U*^l  <  26  then  there  exist  constants  >  0  (depend¬ 

ing  on  6)  such  that 


»D1  —  D*l  <  k1ID1  -  D*»  +  k2o  ,  for  1  <  1  <  n 


where 


and  D* 


**»**«  ^  •fa  <T  ,k,  •*.  »*.  «*«  .  •*.  **»  •'  *1%,  v **.  *’  •  •*  .  **.  ■1^  *'T* 


Proof:  IU(1)  -  <  26  implies  IU(J)  -  <  26 

for  j  -  1,  2,  ....  1.  Hence  by  Lemma  4.2.3, 

'♦j  -  -  4>J*2  +  2p2< 2 a2  +  2 IL*  I2  p2»U(J)  -  U*(J)12  , 


for  j  “  1,  2,  ...»  i 


(4.2.7) 


*i '  J,  r*j  '  »*'2  ■  ,5i  -  °v2 


Noting  that  IU^  -  U*^l2  <  R,  j,  we  get  from  (4.2.7), 


Rj  "  RJ-1  -  r J  +  k  Rj~l 


(4.2.8) 


where  r q  m  0  and  r^  «  -  <^*l2  +  2p2tc2o2  for  j  >  0  and  k  >  2lL*l2p2 


for  j  >  1. 


Now  iterating  (4.2.8),  we  obtain 


Rj  <  r j  +  ( 1+k)  Rj.j 


<  l  (l+k)j  r 
j-0  1  3 


<K  I  r  , 

j-1  3 


where  K  >  max{(l+k)^} 

J 


That  is 
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where  kj  ■  K  and  k^  ■  2n  k-  K  . 

2  2  2 

Now  using  the  fact  that  a,b,c  0  and  a  <,  b  +c  imply  a  <  bfc, 
we  get  the  desired  result. 

Theorem  4.2.4  (Linear  Convergence) 

Given  m  a  fixed  positive  Integer  and  r  e  (0,1),  there  exists  e  >  0 

0  k 

such  that  if  lx  -  x*S  e  then  Algorithm  I  produces  a  sequence  x 

for  k  -  0,  1,  2,  ....  which  satisfies 

lx^+^  -  x*l  <  rlx^  -  x*l  ,  for  k  «  0,  1,  2,  ... 

Proof:  By  Theorems  2.6.1  and  2.6.2  we  can  choose  >  0  such  that  if 

lx-x*l  _<  e  then  there  exists  a  constant  Cq  >  0  such  that 

IL(x)  -  L*  +  U(x)  -  0*1  <  cQ  Rx-x*  II  (4.2.8) 

and  if  Pq  A(x)Qq  is  LU  factorlzable  for  some  pivoting  strategy 
(Pq,Qq),  then  Pq  A*Qq  is  LU  factorlzable.  Without  loss  of 
generality,  we  shall  assume  that  the  pivoting  strategy,  (Pq>Qq)>  is  the 
identity  throughout.  Now  choose  0  <  e  <  and  6  >  0  to  satisfy 


y(l+r)  [ice  +  26(6+y)]  £  r(l-r)  where  y  *  max{IL*l,  IU*I>  (4.2.9) 

1  -  k“ 

{[max(l,  kjJJ®  cQ  +  k2^T~-'~k  -  26  *  (4.2.10) 

Note  since  kj ,  k^  depend  on  6  we  choose  6  first  and  then  take 
e  small  enough  so  (4.2.9)  and  (4.2.10)  are  satisfied. 

Since 

k+1  *  k  *  /.kv-1  ks 

x  -  x*  -  x  -  x*  -  (A  )  F(x  ) 

-  (Ak)_1  {[F(xk)  -  F(x*)  -  F’(x*)  (xk-x*)J 
+  [Ak  -  F*(x*)]  (xk  -  x*)} 

we  have 

Ix1^1  -  x*l  <  l(Ak)_1 I  { IF(xk)  -  F(x*)  -  F’(x*)  (xk-x*)l 

+  IAk  -  F'(x*)l  »xk  -  x*l}  .  (4.2.11) 


Using  the  notation  F'(x)  ■  A(x)  ■  L(x)  U(x),  we  note  that  if 
lx-x*l  <  e  <  er  then  IL(x)  -  L*  +  U(x)  -  U*l  <  CQe  <  26  by  (4.2.8) 
and  (4.2.10).  Hence  KL(x)  -  L*l  <  26  and  IIU(x)  -  U*  I  <  26  and 
IL(x) I  <  IL(x)  -  L* I  +  IL*I  <  26+y.  Hence 


IA(x)  -  A*»  -  IL(x)  U(x)  -  L*  U*l 

<  IL(x) I  IU(x)  -  U*l  +  IL(x)  -  L*  I  HU*  I 

<  (26+y)6  +  6y  -  26(&fy)  .  (4.2.12) 

Now  using  Che  Banach  Perturbation  Lemma  (Theorem  2.6.4)  and 
26(6+y)  y(l+r)  <  1  from  (4.2.9),  we  obtain 

HA(x)]"1*  <  y(|^)  .  (4.2.13) 

We  now  use  a  triple  Induction  to  show  that  H(k)  is  true  for 
k  ■  0,  1,  2 .  where 

H(k)  =  {lxk+1  -  x*l  <  rlxk  -  x*l  and  »Dk  -  D* I  <  26} 

—  n  n  — - 

lr 

with  as  defined  in  Lemma  4.2.3  .  (4.2.14) 

First  we  show  H(0)  Is  true. 

By  (4.2.8)  and  (4.2.10),  »D°  -  D*»  <  26  and  from  (4.2.11)  and 

n  n  — 

(4.2.13) 

lx1_x* I  <  Y(-p~)  { IF(x°)  -  F(x*)  -  F'(x*)  (x°-x*)«  +  26(6fy)  »x°-x*l} 
<  {*£  +  26(6+y)}  tx^-x*l  using  (4.1.1) 

<.  rlx^-x*l  using  (4.2.9)  . 

Hence  H(0)  is  true. 
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•*. 


Now  assume  H(k)  is  true  for  k  ■  0,  1,  . prn-1 .  He  shall  first 
verify  that  H(pm)  is  true  and  then,  by  Induction,  that  H(k)  is  true 
for  k  -  pmfl,  port-2,  ...,  (p+l)m-l. 

UXl  |r 

Since  lx  -  x*l  ^  rlx  -  x*l  for  k  ■  0,  1,  . ..,  pm-1,  we  have 
that  lxpm  -  x* I  e .  Now  since  Apm  ■  A(xpm)  by  definition  of  m, 

we  can  show  H(pm)  is  true  in  exactly  the  same  way  as  was  done  for 
H(0)  above. 

Now  asume  H(k)  is  true  for  k  •  pm,  pnrt-1 ,  ...,  pnr+j-1  with  j  <  m. 
Denoting  q  ■  porfj-1,  we  first  show  ID^  -  I  ^  6  by  induction  on 
row  i. 

Note  that  by  induction  hypothesis  on  k  we  have 

IDk  -  D*l  <  26  ,  for  0  <  k  <  q  . 

n  n  —  —  — 

Hence 

IDk  -  D*l  <  26  ,  for  0  <  k  <  q  and  1  <  i  <  n  . 

Thus 

l(Uk)(1)  -  (U*)(i)l  <  26  for  0<k<q  and  l<i<n. 

(4.2.15) 

Therefore  by  Lemma  4.2.13 


IDq  -  D*l  <  kjIDj"1  -  D*l  +  k26 

<  kq  ^ID?®  -  Ml  +  k9(<J  +  k.a  .  +  •••  +  k?  p®  a  ) 

—  1  1  I  z  q  1  q-1  l  pm 

1  -  km 

<  [ [max(l ,  kj)]®  cQ  +  k2^  ~k|^3e 

<  26  by  (4.2.10)  . 

Now  asume  IDq  -  D*l  <  26  for  1  <  A  <  1.  Then  l(Uq)(1)  -  (U*)(i) I 
<  26  and  by  Lemma  4.2.3 

IDJ  "  DJI  <  ki,Di-1  "  +  k2° 

1  *  kT 

<  [ [max( 1 ,  kj)]m  Cq  +  k2(i  _"k~) ]e  bT  iteration 

£26  by  (4.2.10)  . 

Hence  by  induction  on  row  i  we  now  deduce  IDq  -  D* I  <  26.  Now  as 

n  n  — 

in  (4.2.12)  and  (4.2.13)  above,  we  deduce  that 

IAq  -  A*l  <  26(6+y)  and  *(Aq)_1  R  y(-jq:) 

since  ILq  -L*+Uq-U*l"  IDq  -  D*l.  Hence  from  (4.2.11) 

n  n 


Ix^1  -  x*l  <  Y(-p|)  {lF(xq>  -  F(x*)  -  F'(x*>  (xq-x*)l 

+  26(  6fy)  II  xq  -  x*t} 

<  Y(j^)  {ke  +  26(6+y)  }  |xq  -  x*l 
rlxq  -  x*l  from  (4.2.9) 

Therefore  H(q)  is  true  and  hence  by  induction  H(k)  is  true  for 
k  *  pm,  pnrt-1,  . ...  pmHn-1 .  Completing  the  induction  we  deduce  that 
H(k)  is  true  for  k  *  0,  1,  2,  ...  . 


Theorem  4.2.5 

k 

With  the  same  hypothesis  as  in  Theorem  4.2.4,  the  sequence  x 
k  *  0,  1,  2,  ...  is  m-step  Q-superlinearly  convergent  to  x*. 


Proof:  Setting  p  -  am  for  a  ■  0,  1,  2,  ...  we  have 


l(A>  ~  A2,(ll',fl  ~  *P)I  <  '*p  -  **|  *  0 
lx1*1  - 


as  p 


By  Theorem  3.1  of  Dennis  and  More  (1974),  this  implies 


Ix*^  -  x*l 
lx**  -  x*l 


0  as  p  ■>  ®  . 


for 
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Convergence  Antlyila  for  Algorithm  II 
Update  II  is  a  sparse  version  of  an  update  presented  in  Johnson  and 
itria  (1983).  There  it  was  demonstrated  that  if  a  certain  bounded- 
:erloratlon  property  was  satisfied  then  the  algorithm  is  linearly 
ivergent.  The  following  theorem  is  paraphrased  from  Johnson  and 
itria  (1983)  to  conform  with  our  notation. 

eorea  5.1.1 

If  F( •)  possesses  the  properties  of  Section  4.1  and  the  Update  II 
tlsf ies 

IUk+l  -  U*  +  Nk+1  -  N* I  <  [l  +  a1o(xk,  xk+1)  ]  »Uk  -  U*  +  Nk  -  N*l 

+  a^aix*',  xk+1)  (5.1.1) 

en  there  exist  e  -  c(r)  and  6  ■  6(r)  such  that  if 

lx^  -  x*l  ^  e 

»U°  -  U*  +  N°  -  N*l  <  6 

(N*^  Uk)  is  defined  by  Update  II  for  k  >  0  ,  (5.1.2) 

1c 

en  the  sequence  {x  }  defined  by  Algorithm  II  satisfies 

lxk+*  -  x*l  r  lxk  -  x*l  ,  for  k  ■  0,  1,  2,  ... 
reover,  IN*1 1,  IUk I ,  !(!?**)  *1  and  Ku*1)"  *■  I  are  uniformly  bounded.  G 
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Hence,  to  establish  linear  convergence.  It  would  be  sufficient  to 
verify  that  the  bounded-deterioration  property  (5.1.1)  Is  true  for  the 


sparse  Update  II. 


such  that  if  o(x,  x)  <  e  for 
F(x) ,  then 

IN*y  -  U*sl  <  kQ  o(x,  x)l  S  1  (s )  I 

where  N* ,  U*  are  the  1th  rows  of  N*,  U*  respectively. 


5.1.2 


There  exists  a  constant  kg  >  0 
snail  enough,  x  "  x+s  and  y  ”  F(x)  - 


Proof: 


IN*y  -  U*sl  -  IN*(y  -  A*sl 

“  '  I  n*  (y  "  A*s)l 
Jcr,  J  J 

where  r)  •  x(z^)  and  A*  is  the  j*"*1  row  of  A* 

<.  k.  I  ^  (y  -  A*)s I  where  k.  -  max  {|N*  j) 
j«t)  J  J  j  €  n  J 


<  k.  I  ko(x,  x)  IS  (s) I  by  Lemma  4.1.1 
j€T) 


<  k.  ko(x,  x)  l  IS  i(s)l 
1 
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• 


—  L  i 

jC  kj  ico(x,  x)  •  rank(Z  )  •  IS  (s)  I 

i 

where  At  -  Span{Z^  :  j  e  x(Z^) } 

-  Xi 

<  kQ  o(x,  x)  IS  (a) I 

where  kg  *  k^  <n  and  by  Theorem  3.2.1(b). 


5.2.2 

rhere  exist  constants  >  0,  p^  >  0  such  that  if  o(x,  x)  e 
z  small  enough,  then 


Pi 


< 


IS  i(s)l 

Xi  i 
IS  (v1)! 


<  P2  • 


for  1  <  i  <  n  . 


:  Let 


A<*> 


be  as  defined  in  Lemma  3.2.3. 


Then  using  Lemma 


,  we  have 


■  A*^^s  +  G*(x,  x,  s) 


(5.1.3) 


Gj(x,  x,  s) 


(  g j (x ,  x,  s)  ,  for  j  <  i-1 


(5.1.4) 


0  , 


otherwise 


|  gj  (x,  X,  s)|  <  KTj  o(x,  x)  IS  ^(s)l  .  (5.1.5) 
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From  (5.1.3) 


Xi  i  Xi  (1)  Xi  1  - 

T  1(v1)  -  T  1[A*U,s]  +  T  1[G1(x,  x,  s)] 


(5.1.6) 


>w 


X  XX 

T  i[A*(1)s]  -  [T  i(A*(i))]  [T  i(s) J  from  Lemma  3.2.3  (5.1.7) 


id 


IT  1[G1(x,  x,  s)l I 


<.  max{K  }  o(x,  x)  {  J  IT  i(s)l2}1^2  where  ti  -  x(zf) 

j  J  J«n 


£  ko(x,  x)  IT  (8)1  where  k  ■  n  max{ic.  } 

j  J 


and  Aa  -  Span {Z^  :  j  e  x(Z^)} 


j<  k  o(x,  x)  IT  ( s ) I  since  A^  -  X ^  by  Theorem  3.2.1(b) 

(5.1.8) 

ub8tituting  from  (5.1.7)  and  (5.1.8)  into  (5.1.6)  we  obtain 


XX  X 

IT  i(vi)l  <  [it  i(A*(1))l  +  k  a(x,  x)  ]  IT  i(s)  I 


i  Xi  i 

PjIT  (8)11  where  is  a  constant  chosen  greater 

X 

than  IT  A(A*(1)) I  +  ko 


i  i 

<  p. IT  (8)1  where  p.  ■  max{p.}  . 

i  1 


(5.1.9) 
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Now  let  -  Tr^A*'1').  By  Lenma  3.2.3,  Is  nonsingular.  Thus 
from  (5.1.7) 

tSs)  -  M^[T  V)]  -  M^1  T  1[Gi(x,  x,  s)] 

XX  X 

IT  i(s)l  <  IM"1!  IT  i(vi)»  +  IN”1 1  ko(x,  x)  IT  i(s)  I  using  (5.1.8) 

Hence 

X  X 

[1  -  IM”1 1  ko(x,  £)]  IT  i(s)l  <  in”1 1  IT  A( v1) I  . 

If  a  Is  small  enough  so  that  1  -  IM^*  I  ko(x,  x)  >  0,  then 

X  X 

IT  1(s)l  <  p2IT  i(vi)l  , 

where 

p„  -  max{ ImT1 1/(1  -  IM"1!  ko(x,  x))}  .  (5.1.10) 

1 

The  result  now  follows  from  (5.1.9)  and  (5.1.10)  by  noting  that  for 
any  v  €  Rn,  IS^(v)l  *  IT^(v)l  for  any  subspace  A.  □ 

Theorem  5.2.3 

Update  II  satisfies  the  bounded-deterioration  condition  (5.1.1)  of 


Theorem  5.1.1 


Proof:  We  show  the  result  for  successive  pairs  (N,U),  (N,  U) 
where  xf  x  ■  xfs  are  the  corresponding  successive  points  and 
y  -  F(x)  -  F(x). 

Denote 


D  -  N  -  N*  , 


E  -  U  -  U*  , 


at*  *  N*y  -  U*s , 


D  -  N  -  N* 


E  -  U  -  U* 


at  **  Ny  -  Us 


Then  from  (2.4.3) 


(Di  +  V 


<Di  +  Et) 


Since 


“l  ”  Niy  "  Ul8 


-  (D^y  -  E1s)  +  a* 


(D1  +  Et)  v  +  a* 


-  (D±  +  Et)  S  1(vi)  +  a* 


we  have,  for  »SX*( v1)  I  +  0, 


if  IS  i(v1) I  - 


X  X 

(Da  +  E^)  -  ct^S  i(vi)]T/IS  ^v*)!2  otherwise  . 


(5.1.11) 


(5.1.12) 


BSIT.B'WSVV.^VV 


!jv_v  .xir.r. 


i 

I 

i 

i 


i 


I 


(Dt  +  -  (Dt  +  E1)  -  [(D±  +  E±>  S  1(v1)  +  o^]  [s  i(vi)f/IS  1(vi)|2 


ViVi> 


(da  +  ea)  (i  --p-)  - 


a* 


ViVi 


i  T 
ViVi 


i  1 

where  ■  S  (v  )  . 


(5.1.13) 


Thus  for  IS^Cv1)!  +  0, 


ID. 


±  +  E±  +  a*(~P— )2I  -  l(D1  +  E1>  (I  -  -p)l 


vlvi  \  .2 


vivl 


vlvi 


KD.  +  E^v/ 

*I>1  E±  I - 2 - 


(5.1.14) 


lv^  II 


and  for  ISX*(  v1)  I  -  0, 


ID±  +  E±l  -  IDi  +  . 


(5.1.15) 


Let  W  e  Fn*n  satisfy 


(  5i  +  *1 

Wi  "  j 

(  Da  +  IA  +  a*  vj/(vjvt) 


if  Iv1!  -  0 


otherwise 


and  let 

%  *  {1  :  Iv1!  ^  0) 

Then 


r 


i 
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(5.1.16) 


I  a*r  1/2 

ID  +  E »  _<  IWI  +  [  l  — ] 

1  1  i£it  Iv^ 


But  from  (5.1.14)  and  (5.1.15) 


l(D  +  E  )  v  r 
IWI  -  ID  +  El  -  l  - = - =; - — 


ieit  Iv^l 


-  (1-9)  ID  +  El  , 


where 


ID  +  El  ,  0  <  0  <  1  (5 


and  from  Lemmas  5.2.1  and  5.2.2,  for  1  e  it, 


Iv 


—  Ai 

a*|  kn  o(x,x)  IS  (s)l 

<  - 5 - <  kg  P2  o(*»x) 


-i  .3 


(5 


*  P2  »S  (a)  I 

Substituting  from  (5.1.17)  and  (5.1.18)  into  (5.1.16),  we  get 


ID  +  El  <  /(1-0)  ID  +  El  +  p  <t(x,x)  where  p  ■  yto  kg  p2  (5 


.1.17) 


.1.18) 


.1.19) 


Thus  the  bounded-deterioration  property  (5.1.1)  is  satisfied  with 
0^  -  0  and  ■  p. 


.  *»--  H".  ..  ■ 


Theorem  5.2.4 

Algorithm  II  is  Q-superlinear ly  convergent. 

Proof:  The  strong  form  of  the  bounded-deterioration  condition  (5.1.19), 
with  a,  ■  0,  is  sufficient  to  ensure  Q-superlinear  convergence.  The 
proof  proceeds  among  the  same  lines  as  Theorem  3.5  of  Johnson  and 
.Austria  (1963)  and  will  be  omitted  here.  □ 

6.  Concluding  Remarks 

The  non-cancellation  assumption  is  essentially  a  non-degeneracy 
assumption,  similar  to  the  case  for  linear  programming.  Without  it. 
Algorithm  I  is  probably  still  convergent  since  periodic  restarts  are 
necessary  anyway.  However,  it  seems  unlikely  that  the  Q-superlinear 
convergence  of  Algorithm  II  will  be  preserved  if  we  underestimate  the 
density  of  N  and  U.  It  is  interesting  that  the  use  of  the  non- 
cancellation  assumption  eliminates  a  problem  encountered  in  the  update 
of  Dennis  and  Marwil  (1982).  There  it  was  necessary  to  forgo  updating 
any  row  for  which  the  norm  of  the  projection  of  the  previous  Newton 
step,  s,  onto  the  sparsity  pattern  of  that  row  was  too  small.  For 
Updates  I  and  II  we  were  able  to  demonstrate  that  these  projections 
never  get  too  small  relative  to  s. 

The  theoretical  justification  for  Update  II  appears  to  be  much 
stronger  than  for  Update  I,  since  Q-super linear  convergence  is  ensured 
without  the  need  for  periodic  recalculation  of  the  Jacobian  matrix  from 
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scratch.  However,  it  should  be  noted  that  this  desirable  property  is 
not  achieved  without  cost.  In  general,  N  will  be  less  sparse  than  L 
and  so  Algorithm  II  will  incur  a  greater  storage  cost  than  Algorithm  I. 
In  some  pathological  cases,  N  can  be  full  for  an  extremely  sparse  L, 
for  example,  if 


Fortunately,  for  sparse  problems  we  have  much  greater  leeway  in  choosing 
a  pivoting  strategy  than  for  full  matrices  —  see  Duff  (1977),  Reid 
(1971)  —  and  the  above  example  can  usually  be  avoided  even  when  A  is 
trldlagonal,  by  using  appropriate  row  and  column  permutations. 

It  would  be  useful  if  an  update  could  be  found  which  also  allowed 
changes  in  the  pivoting  strategy,  i.e.,  starting  with  (P,L,U)  where 
PA  “  LU  for  some  permutation  matrix,  P,  find  an  appropriate  updated 
triple  (P,L,U)  where  LUs  ■  Py.  This  deficiency  in  Updates  I 
and  II  would  seem  to  suggest  that  they  are  of  limited  value,  since  any 
attempt  at  global  application  must  quickly  fail.  However,  they  find  a 
useful  application  in  the  predictor-corrector  continuation  problem 
(Allgower  and  Georg  (1981))  for  which 


(i)  Many  Newton-type  problems  must  be  solved  so  it  is  desirable 
to  use  an  inexpensive  quasi-Newton  method. 

(ii)  The  level  of  difficulty  of  successive  Newton  problems  can  be 
adaptively  chosen  and,  hence,  use  of  Newton's  method  may  be 
an  expensive  overkill. 

(iii)  The  global  problem  is  naturally  broken  down  into  a  series  of 
local  problems,  each  of  which  can  be  solved  separately  using 
either  Algorithm  I  or  II,  thus  overcoming  the  problem  of 
having  a  fixed  pivoting  strategy  in  the  updates. 


CHAPTER  4 


Computational  Experience 

1.  Introduction 

Tae  Ideas  of  the  preceding  three  chapters  have  been  implemented  in 
a  Fortran  program.  Data  structures  appropriate  for  sparse  systems  are 
used  throughout.  For  example,  only  the  non-zero  elements  of  the 
Jacobian  matrix  M  are  stored,  and  these  are  maintained  in  three  arrays 
A  (double  precision),  INUM  (integer)  and  JNUM  (integer),  where 
M(I,J)  0  if  and  only  if  there  is  some  K  such  that  A(K)  ■  M(I,J), 
INUM(K)  ■  1  and  JNUM(K)  ■  J.  The  LU  factors  of  M  are  obtained  by 
Gaussian  elimination  using  a  threshold  pivoting  strategy  with  both  row 
and  column  permutations  allowed.  The  subroutine,  LU1FAC,  which  is  part 
of  the  LUSOL  package  (see  Gill,  et  al.  (1984))  was  used.  Obtaining  the 
NU  factors  explicitly  (where  N  ■  L  *)  consists  essentially  of 
Inverting  the  matrix  L  which  results  from  the  Gaussian  elimination. 

3 

This  is  a  quite  expensive  process  requiring  0(n  )  operations;  it  is 

2 

hoped  that  sparsity  will  reduce  this  to  0(n  )  operations.  The  Jacobian 
matrix,  M,  is  obtained  by  a  finite  difference  approximation;  the  graph 
coloring  heuristics  of  Coleman  and  More  (1981)  are  used  to  reduce  the 
number  of  function  evaluations  necessary  for  each  matrix  approximation. 

2.  Local  Comparison 

We  first  compare  the  local  behaviors  of  the  updates  discussed  in 
Chapter  3.  The  following  two  problem  types  are  taken  from  Broyden 
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(1971),  where  they  were  used  to  compare  the  behavior  of  the  sparse 
Broyden  method  with  Newton's  method. 

Type  Is  f1(x)  •  (3  -  +  1  ”  x^  “  2xi+1  »  1  <.  i  <  n 

_  i+r,  2 

Type  2:  f±(x)  ■  (kx  +  k2  *1)x1  +  1  -  k3  l  (x^  +  x^)  ,  1  <  i  <  n 

Jl*  1 

For  both  types  we  have  f  :  Rn  •*  Hn  and  x^  ■  0  for  j  <  1  or 
j  >  n.  The  Initial  estimate  of  the  solution  in  each  case  was  taken  to 
be  x® ,  where  x^  ■  -1  for  1  <  1  <  n.  The  Iterations  were  stopped 
at  X*  where  lx'*'  -  x'*'  *1  e  and  lx^  -  *  I  >  e  for  j  <  A  and  e 

given.  Tables  1,  2,  and  3  give  the  results  for  varying  dimension  and 
values  of  the  parameters  k^ ,  k^  and  ky  The  following  code  is  used  for 
type  of  iterative  technique: 

1  :  Newton's  Method 

2  :  Dennis/Marwil  Update 

3  :  Update  I  of  Chapter  3  for  LU  factors 

4  :  Update  II  of  Chapter  3  for  NU  factors 

Only  full  Newton  steps  were  taken.  All  three  updates  compared  favor¬ 
ably,  in  terms  of  the  number  of  function  evaluations,  with  respect  to 
Newton's  Method.  A  comparison  with  the  results  of  Broyden  (1971)  — 
where  a  slightly  different  stopping  criterion  is  used  —  shows  that 
these  updates  require  about  the  same  number  of  iterations  as  the  sparse 
Broyden  Update.  Here,  however,  we  have  eliminated  the  need  for  a  matrix 
factorisation  at  each  iterate. 


3.  The  Continuation  Problem 

The  following  seven  test  problems  were  used. 

Problems  1  and  2 
The  homotopy 

h(x,t)  -  f(x)  -  (1-t)  f(x°)  , 

is  traced  from  (x^,0)  to  (x,l).  Problems  1  and  2  correspond 
respectively  to  choosing  f(»)  from  Types  1  and  2  of  the  previous 
section. 

Problem  3  (Watson  (1979d)) 

We  solve  the  linear  complementary  problem  by  the  use  of 
Mangasarlan's  transformation  (Mangasarian  (1976)). 

We  have 

f  :  Rn  ■*  Rn 

f(x)  ■  Ax  +  q  ,  A  e  RnXm 

Aii  "  6’  Aij  “  "4  for  Ii-Jl  ”  1 

A^  ■  1  for  j  i-J  |  ■  2  and  A^  ■  0  for  |  i-j  |  >  2 
q  “  X(— 1 ,  0,  «...  0)  ,  \  ^  0 
We  wish  to  solve  the  following  problem: 


Define  g  :  *n  -*•  Rn  by 


g^x)  -  jf^x)  -  xi|3  -  (f1(x))3  -  x3  ,  1  <  i  <n 


and  h  :  Rn+1  -*•  Kn  by 


h(x,X)  -  -X  g(x)  +  (l-X)(x-xu)  . 


We  now  trace  the  path  {(x,X)  :  h(x,X)  *0}  from  (x  ,0)  to  (x,l) 
where  x  solves  the  linear  complementary  problem  (3.1)  (see  Watson 
(1979d)). 


Problem  4  (Kellog,  LI  and  Yorke  (1976) ) 


f  :  Rn  -*•  »n 


F  (x)  -  a.  +  b.  x  x  x 
i  i  ^  ®i  ^1 


where  0  <  a.pb.^  <  1  and 


^  ( 1 »  i  n  } 


The  data  a^,  b^,  <x^,  |3^,  y^  are  obtained  by  random  number 
generation.  A  fixed  point  of  f(»)  is  located  by  tracing  the  zero 
curve  of  the  homotopy 


h(x,t)  ■  x  -  t  f(x) 


from  the  point  (x  ,0)  ■  (0,0)  to  (x,l). 


Problem  5  (Saigal  (1981)) 


The  following  boundary-value  problem  is  solved  by  discretization 
u"  -  (2u  -  0. 5t  +  l)3 
u(0)  “  u(l)  =  0 

On  a  mesh  of  n  points  we  have  =  u(jh),  1  _<  j  _<  n  where  h  «•  l/(n+l) 
and 

fjCx)  =  (Xj  -  2Xj  +  x^)  -  2h2(Xj  -  4^  +  l)3  ■  0  ,  1  <  j  <  n  . 

We  solve  f(x)  -0  by  tracing  the  zero  curve  of  the  homotopy 

h(x,t)  *  tf(x)  +  (l-t)x 
from  (x^,0)  ■  (0,0)  to  (x,l). 

Problem  6 

As  in  problem  5,  we  solve  the  following  boundary  value  problem  by 
discretization  on  n  points 


u"  +  \eu  -  0  , 
u(0)  -  u(  1)  “  0  . 

This  is  an  example  of  a  parametric  problem  in  which  we  may  be  interested 
in  all  solutions  for  X  in  some  range. 
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Table  10:  Problem  7 


Type  of 
Update 

Dimension 

No.  of 

Predictor-Corrector 

Cycles 

No.  of  Final 
Iterates  at 
Level  t  *  1 

No.  of 
Function 
Evaluations 

11 

9 

13 

3 

438 

22 

9 

18 

4 

204 

33 

9 

12 

7 

165 

44 

9 

12 

7 

164 

41 

9 

13 

4 

264 

11 

39 

11 

3 

451 

22 

29 

18 

5 

291 

33 

39 

* 

* 

* 

31 

39 

13 

5 

289 

44 

39 

* 

* 

* 

41 

39 

13 

5 

289 

* 

denotes  failure 

of  the  algorithm  to 

converge . 

We  used  different  starting  points  for  the  thirty-nine 
dimensional  problem  and  the  nine-dimensional  problem  (see 
Watson  (1980b)). 


Table  9:  Problem  6 


Type  of 
Update 


Dimension 


No.  of  No.  of  Final  No.  of 

Predictor-Corrector  Iterates  at  Function 

Cycles  Level  t  -  1  Evaluations 


‘  V  •  '  *  •  *  •  *  ./♦*  .  *  v* 


v.  a  /v:  /.  .* 


Table  8:  Problem  5 


No.  of 

No.  of  Pinal 

No.  of 

Type  of 

Predictor-Corrector 

Iterates  at 

Function 

Update 

Dimension 

Cycles 

Level  t  ■  1 

Evaluations 

11 

100 

4 

3 

90 

22 

100 

4 

3 

50 

33 

100 

4 

3 

50 

44 

100 

4 

3 

50 

J.Y. 


' 


TT" 


Table  7:  Problem  4 


No.  of 

No.  of  Final 

No.  of 

Type  of 

Predictor-Corrector 

Iterates  at 

Function 

Update 

Dimension 

Cycles 

Level  t  ■  1 

Evaluations 

11 

50 

5 

3 

196 

22 

50 

5 

5 

90 

21 

50 

5 

5 

135 

33 

50 

5 

4 

89 

31 

50 

5 

4 

134 

44 

50 

5 

4 

89 

41 

50 

5 

4 

134 

11 

100 

6 

3 

247 

22 

100 

6 

7 

107 

33 

100 

6 

4 

104 

44 

100 

6 

4 

104 

Table  6:  Problem  3 


Type  of 
Update 

Dimension 

No.  of 

Predictor-Corrector 

Cycles 

No.  of  Final 
Iterates  at 
level  t  ■  1 

No.  of 
Function 
Evaluations 

11 

10 

16 

4 

681 

22 

10 

33 

7 

510 

21 

10 

20 

6 

441 

33 

10 

* 

* 

* 

31 

10 

18 

9 

407 

44 

10 

26 

7 

440 

41 

10 

18 

11 

408 

*  denotes  failure  of  the  algorithm  to  converge 


Table  5:  Problem  2 


kl  “  k2  *  k3  “  I*0*  ri  “  r2  "  lf  n  "  50 

“  (-1,  -1,  ••• >  ”1) 


Type  of 

No.  of  Predictor- 

No.  of  Final 
Iterates 

No.  of 
Function 

Update 

Corrector  Cycles 

at  t  ■  1 

Evaluations 

11 

5 

4 

168 

22 

8 

6 

75 

44 

8 

7 

76 

40 

9 

5 

88 

41 

5 

6 

100 

111 


'  V  \'  ^ 


wSs 


‘17’ 


Table  4:  Problem  1 


kl 

-  1.0,  x°  -  (-1,  -1, 

•  • . ,  — 1 ) 

No.  of 

No.  of  Final 

No.  of 

Type  of 

Predictor-Corrector 

Iterates  at 

Function 

Update 

Dimension 

Cycles 

Level  t  1 

Evaluations 

11 

5 

3 

3 

145 

11 

20 

9 

4 

195 

11 

100 

30 

4 

270 

22 

100 

30 

4 

120 

33 

100 

30 

4 

120 

44 

100 

30 

4 

120 

41 

100 

30 

4 

175 

110 


.v.  •  -‘v*.’*  - 

./•.•■•/Lvt’A’A’.  •  - 


.'.V. 


•  ‘»h  ’  .  *  -**  »*"  4. 


-r. 


Table  3,  (continued) 


No.  of  No.  of 


No.  of 

Function 

No.  of 

Function 

Type  of 

Iterates 

Evaluations 

Steps 

Evaluations 

Update 

kl 

k2 

k3 

«o 

i 

o 

—< 

■ 

w 

e  -  10"6 

e  -  lO"10 

,n-10 
e  •  10 

1 

2 

2 

2 

5 

66 

6 

79 

2 

2 

2 

2 

12 

26 

20 

34 

3 

2 

2 

2 

10 

24 

20 

34 

4 

2 

2 

2 

10 

24 

16 

30 

1 

2 

3 

2 

5 

66 

6 

79 

2 

2 

3 

2 

11 

25 

20 

34 

3 

2 

3 

2 

11 

25 

21 

35 

4 

2 

3 

2 

10 

24 

21 

35 

1 

2 

4 

1 

5 

66 

6 

79 

2 

2 

4 

1 

12 

26 

22 

36 

3 

2 

4 

1 

16 

30 

31 

45 

4 

2 

4 

1 

16 

30 

42 

56 

1 

2 

5 

1 

6 

79 

6 

79 

2 

2 

5 

1 

13 

27 

24 

38 

3 

2 

5 

1 

22 

36 

36 

50 

4 

2 

5 

1 

22 

36 

48 

62 

1 

3 

4 

1 

6 

79 

6 

79 

2 

3 

4 

1 

14 

28 

27 

41 

3 

3 

4 

1 

18 

32 

38 

52 

4 

3 

4 

1 

22 

36 

45 

59 

1 

3 

5 

1 

6 

79 

7 

92 

2 

3 

5 

1 

16 

30 

31 

45 

3 

3 

5 

1 

20 

34 

42 

56 

4 

3 

5 

1 

24 

38 

57 

71 

109 


!  **-  *  .  •  .  *  .  •  .  *i“. 

A  .  .i  A  »  *^NJi  »j. 


t-’SVi." 

Table  3:  Type 

2,  dimension 

-  50 

rl  ■  r2  ’  5 

No.  of 

No.  of 

No.  of 

Function 

No.  of 

Function 

Type  of 

Iterates 

Evaluations 

Iterates 

Evaluations 

Update 

kl 

k2 

k3 

e  -  10"6 

e  -  10~6 

lft-10 
e  ■  10 

e  -  10“10 

1 

1 

1 

1 

4 

53 

5 

66 

2 

1 

1 

1 

8 

22 

17 

31 

3 

1 

1 

1 

8 

22 

13 

27 

4 

1 

1 

1 

7 

21 

13 

27 

1 

2 

1 

1 

5 

66 

6 

79 

2 

2 

1 

1 

10 

24 

18 

32 

3 

2 

1 

1 

9 

23 

16 

30 

4 

2 

1 

1 

9 

23 

16 

30 

1 

1 

2 

1 

5 

66 

6 

79 

2 

1 

2 

1 

8 

22 

29 

33 

3 

1 

2 

1 

10 

24 

19 

23 

4 

1 

2 

1 

9 

23 

17 

21 

1 

3 

2 

1 

5 

66 

6 

79 

2 

3 

2 

1 

23 

27 

39 

43 

3 

3 

2 

1 

13 

17 

26 

40 

4 

3 

2 

1 

14 

18 

26 

40 

|  1 

2 

3 

1 

5 

66 

6 

79 

2 

2 

3 

1 

11 

25 

20 

34 

3 

2 

3 

1 

13 

27 

28 

42 

4 

2 

3 

1 

16 

30 

26 

40 

1 

3 

3 

1 

5 

66 

6 

79 

2 

3 

3 

1 

12 

26 

55 

69 

3 

3 

3 

1 

16 

30 

39 

43 

4 

3 

3 

1 

15 

29 

32 

36 

1 

2 

2 

1 

5 

66 
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31 
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25 
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25 
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39 
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66 

6 

79 
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1 
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27 

26 

40 

3 

1 

2 

2 

12 

26 

20 

34 

4 

1 

2 

2 

10 

24 

25 

39 
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Table  2s  Type  2,  e  -  10 


-10 


kl 

‘  k2  '  k3  ‘ 

1.0 

No.  of 

Type  of 

No.  of 

Function 

Update 

Dimension 

rl 

r2 

Iterates 

Evaluations 

1 

50 

1 

1 

6 

31 

2 

50 

1 

1 

27 

32 

3 

50 

1 

1 

19 

24 

4 

50 

1 

1 

18 

23 

1 

50 

3 

3 

6 

57 

2 

50 

3 

3 

14 

23 

3 

50 

3 

3 

19 

28 

4 

50 

3 

3 

17 

26 

1 

50 

5 

1 

6 

55 
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50 

5 
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15 

24 

3 

50 

5 

1 

18 

27 

4 

50 

5 

1 

18 

27 
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Table  1:  Type  1,  e  ■  10 
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-10 


No.  of 


Type  of 

No.  of 

Function 

Update 

Dimension 

kl 

Iterates 

Evaluations 

1 

10 

0.5 

5 

26 

2 

10 

0.5 

13 

18 

3 

10 

0.5 

14 

19 

4 

10 

0.5 

14 

19 

1 

10 

2.0 

6 

31 

2 

10 

2.0 

16 

21 

3 

10 

2.0 

16 

21 

4 

10 

2.0 

17 

22 
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Efficient  implementation  of  such  a  strategy  requires  further  research 
into  the  monitoring  and  control  of  the  algorithm  and  was  not  attempted 
here.  We  have  demonstrated,  as  intended,  that  the  use  of  quasi-Newton 
methods  for  the  sparse  continu  -<on  problem  can  lead  to  an  increase  in 
efficiency  over  more  traditional  methods. 


Problem  7  (Watson  (1960b)) 

The  following  boundary  value  problem  arises  In  the  study  of  the 
motion  of  a  fluid  squeezed  between  two  parallel  plates. 

SCttF''*  +  3f”  +  mf'  f”  -  ff'")  -  f(4)  , 

f (0)  -  f"(0)  -  0.  for  (1)  -  1,  f'(l)  -  0  , 

m  *  0  (axisymmetric  case)  . 

We  discretize  on  p  points  to  obtain  an  n  dimensional  problem,  where 
n  ■  2p  +  1.  See  Watson  (1980b)  for  further  details. 

The  first  column  of  Tables  4-10  contains  a  two-digit  number.  The 
first  digit  Is  the  code  for  the  update  used  on  the  corrector  iterative 
sequence;  the  second  refers  to  the  updating  technique  used  at  the  end  of 
the  corrector  sequence  to  approximate  the  tangent  direction.  The 
tangent  approximation  technique  provided  some  favorable  results  (see 
Table  7),  but  In  general  proved  to  be  unreliable  (as  seen  in  Table  10). 

The  results  of  Tables  4-10  Illustrate  that  substantial  savings 
can  be  achieved  by  using  the  less  expensive  updating  techniques  rather 
than  Newton' 8  method.  As  with  other  quasi-Newton  methods  for  non-linear 
systems  the  extent  of  this  usefulness  depends  on  the  degree  of  non¬ 
linearity  of  the  equations,  with  quasi-Newton  methods  becoming  less  use¬ 
ful  the  more  highly  non-linear  the  system  is.  In  practice,  a  truly 
adaptive  continuation  strategy  would  allow  for  switching  between  quasi- 
Newton  and  Newton  methods  or  even  sparse  simpllcial  techniques. 


Broyden,  C.G.  (1971),  "The  convergence  of  an  algorithm  for  solving 
sparse  non-linear  systems,"  Math.  Comp.  25,  285-294. 

Broyden,  C.G. ,  Dennis,  J.E.  and  More,  J.J.  (1973),  "On  the  local  and 
super  linear  convergence  of  quasi-Newton  methods,"  J.  Inst.  Math. 
Appl.  12,  223-245. 

Chow,  S.N. ,  Mallet-Paret,  J.  and  Yorke,  J.A.  (1978),  “Finding  zeroes  of 
maps:  Homotopy  methods  that  are  constructive  with  probability 
one,"  Math.  Comp.  32,  887-899. 

Coleman,  T.  and  More,  J.  (1981),  "Estimation  of  sparse  Jacobian 

matrices  and  graph  coloring  problems Tech.  Report  ANL-81-39, 
Argonne  National  Laboratory. 

Crandall,  M.G.  and  Rabinowitz,  P.H.  (1971),  "Bifurcation  from  simple 
eigenvalues,"  J.  Func.  Anal.  8,  321-340. 

Curtis,  A.R. ,  Powell,  M.J.D.  and  Reid,  J.K.  (1974),  "On  the  estimation 
of  sparse  Jacobian  matrices,  "  J.  Inst.  Math.  Appl.  13,  117-119. 

Curtis,  A.R.  and  Reid,  J.K.  (1974),  ”  The  choice  of  steplength  when 
using  differences  to  approximate  Jacobian  matrices,”  J.  Inst. 

Math.  Appl.  13,  121-126. 

Davidenko,  D.  (1953),  "On  a  new  method  of  numerical  solution  of  systems 
of  nonlinear  equations,"  Doklady  Akad.  Nauk  USSR  88,  601-602  (in 
Russian) . 

Den  Heljer,  C.  and  Rheinboldt,  W.C.  (1981),  "On  steplength  algorithms 
for  a  class  of  continuation  methods,"  SIAM  J.  Numer.  Anal.  18, 


925-948 


Dennis,  J.E.  and  More,  J.J.  (1974),  "A  characterization  of  superlinear 
convergence  and  its  applications  to  quasi-Newton  methods,"  Math. 
Comp.  28,  549-560. 

Dennis,  J.E.  and  More,  J.J.  (1977),  "Quasi-Newton  methods,  motivation 
and  theory,"  SIAM  Rev.  19,  46-89. 

Dennis,  J.E.  and  Marwil,  E.S.  (1982),  "Direct  secant  updates  of  matrix 
factorizations,"  Math.  Comp.  38,  459-474. 

Dennis,  J.E.  and  Schnabel,  R.B.  (1979),  "Least  change  secant  updates  for 
quasi-Newton  methods,"  SIAM  Rev.  21,  443-459. 

Deuflhard,  P.  (1979),  "A  stepsize  control  for  continuation  methods  and 
its  special  applications  to  multiple  shooting  techniques,"  Numer. 
Math.  33,  115-146. 

Deuflhard,  P.  and  Heindl,  G.  (1979),  "Affine  invariant  convergence 

theorems  for  Newton“s  method  and  extensions  to  related  methods," 
SIAM  J.  Numer.  Anal.  16,  No.  1. 

Deuflhard,  P. ,  Peach,  H.J.  and  Rentrop,  P.  (1976),  "A  modified  continu¬ 
ation  method  for  numerical  solution  of  non-linear  two-point  bound¬ 
ary  value  problems  by  shooting  techniques,"  Numer.  Math.  26, 
327-343. 

Duff,  I.S.  (1977),  "  A  survey  of  sparse  matrix  research,"  Proc.  IEEE  65, 
500-535. 

Eaves,  B.C.  (1972),  "Homotopies  for  computation  of  fixed  points,”  Math. 
Prog.  3,  1-22. 

Eaves,  B.C.  (1976),  "A  short  course  in  solving  equations  with  PL 
homotopies,"  SIAM-AMS  Proc.  9,  73-143. 


Erlsman,  A.M.  and  Reid,  J.K.  (1974),  "Monitoring  the  stability  of  the 
triangular  factorization  of  a  sparse  matrix,"  Numer.  Math.  22, 
183-186. 

Garcia,  C.B.  and  Gould,  F.J.  (1978),  "A  theorem  on  homotopy  paths,”  Math 
of  Op.  Res.  3,  282-289. 

Garcia,  C.B.  and  Zangwlll,  W.X.  (1979),  "Finding  all  solutions  to 

polynomial  systems  and  other  systems  of  equations,"  Math.  Prog.  16, 
159-176. 

Garcia,  C.B.  and'  Zangwlll,  W.X.  (1980),  "The  flex  simpliclal  algorithm," 
Xn:  Numerical  Solution  of  Highly  Non-linear  Problems,  W.  Forster 
(ed.).  Nor th-Holland ,  71-92. 

Garcia,  C.B.  and  Zangwlll,  W.X.  (1981),  Pathways  to  Solutions,  Fixed 
Points  and  Equilibria,  Prentice  Hall. 

Georg,  K.  (1981a),  "Numerical  Integration  of  a  Davidenko  Equation,”  In: 
Numerical  Solution  of  Non-linear  Equations,  E.  Allgower,  K. 
Glashoff,  H.O.  Peltgen  (eds.),  Springer  Lecture  Notes  in  Math.  878, 
128-161. 

Georg,  K.  (1981b),  "On  tracing  an  implicitly  defined  curve  by  quasi- 
Newton  steps  and  calculating  bifurcation  by  local  perturbation," 
SIAM  J.  SSC  2,  35-50. 

Georg,  K.  (1981c),  "A  note  on  stepsize  control  for  numerical  curve 
following,"  Xn:  Homotopy  Methods  and  Global  Convergence,  B.C. 
Eaves,  F.J.  Gould,  H.O.  Peltgen  (eds.),  NATO  Conference  Series 

Gill,  P.E.,  Golub,  G.H. ,  Murray,  W.  and  Saunders,  M.A.  (1974),  "Methods 
for  modifying  matrix  factorizations,"  Math.  Comp.  28,  505-535. 


Gill,  P.E.,  Murray,  W. ,  Saunders,  M.A. ,  Wright,  M.H.  (1984),  "LUSOL:  A 
package  for  updating  LU  factors  of  a  sparse  matrix,"  presented  at 
the  Gatlinburg  9  Conference  on  Numerical  Linear  Algebra,  University 
of  Waterloo. 

Goldstein,  A.  (1967),  "Constructive  Real  Analysis,"  Harper  and  Row. 

Hackl,  J. ,  Wacker,  H.J.  and  Zulehner,  W.  (1980),  "An  efficient  steps! ze 
control  for  continuation  methods,”  BIT  20,  475-485. 

Haselgrove,  C.B.  (1961),  "The  solution  of  non-linear  equations  and 
differential  equations  with  two-point  boundary  conditions," 

Comput.  J.  4,  255-259. 

Hirsch,  M.W.  and  Smale,  S.  (1979),  "On  algorithms  for  solving  f(x)  ■ 

0,"  Comm.  Pure  and  Appl.  Math.  32,  381-312. 

Johnson,  G.W.  and  Austria,  N.H.  (1983),  "A  quasi-Newton  method  employing 
direct  secant  updates  of  matrix  factorizations,"  SIAM  J.  Numer. 
Anal.  20,  315-325. 

Jorgens,  H. ,  Peitgen,  H.  and  Saupe,  D.  (1980),  "Topological  perturba¬ 
tions  in  the  numerical  study  of  non-linear  eigenvalue  and  bifurca¬ 
tion  problems,"  In:  Proc.  Symposium  on  Analysis  and  Computation  of 
Fixed  Points,  S.M.  Robinson  (ed.).  Academic  Press,  139-181. 

Kearfott,  R.B.  (1981),  "A  derivative-free  arc  continuation  method  and  a 
bifurcation  technique,"  In:  Numerical  Solution  of  Non-linear 
Equations,  E.  Allgower,  K.  Glashoff,  H.0.  Peitgen  (eds.),  Springer 
Lecture  Notes  in  Math.  878,  182-198. 

Kearfott,  R.B.  (1983),  "Some  general  bifurcation  techniques,”  SIAM  J. 
Scl.  Stat.  Comp.  4,  No.  1,  52-68. 


121 


Keller,  H.B.  (1977),  "Numerical  solution  of  bifurcation  and  nonlinear 
eigenvalue  problems,"  In:  Applications  of  bifurcation  theory, 

P.H.  Rablnowltz  (ed.).  Academic  Press. 

Kellog,  R.G. ,  Li,  T.T.  and  Yorke,  J.A.  (1976),  "A  constructive  proof  of 
the  Brouwer  fixed  point  theorem  and  computational  results,"  SIAM 
J.  Numer.  Anal.  4,  473-483. 

Kubicek,  M.  (1976),  "Algorithm  502,  Dependence  of  solutions  of  non¬ 
linear  systems  on  a  parameter,  ”  ACM- TOMS  2,  98-107. 

Laasonen,  P.  (1970),  ”  An  Imbedding  method  of  iteration  with  global 
convergence,"  Computing  5,  253-358. 

Leder,  D.  (1970),  "Automatische  Schrittweitensteuerung  bel  global 
konvergenten  Einbettungsmethoden,"  ZAMM  54,  319-342. 

Lemke,  C.E.  and  Howson,  J.T.  (1964),  "Equilibrium  points  of  blmatrix 
games,"  SIAM  J.  Appl.  Math  12,  413-423. 

Li,  T.Y.  and  Yorke,  J.A.  (1979),  "Path  following  approach  for  solving 
nonlinear  equations:  homotopy,  continuous  Newton  and  projection,” 
In:  Functional  differential  equations  and  approximation  of  fixed 
points,  H.  Peitgen,  H.  Walther  (eds.).  Springer  Lecture  Notes  in 
Math.  730,  257-264. 

Li,  T.Y.  and  Yorke,  J.A.  (1980),  "  A  simple  reliable  numerical  algorithm 
for  following  homotopy  paths,"  In:  Analysis  and  Computation  of 
Fixed  Points,  S.M.  Robinson  (ed.).  Academic  Press,  New  York,  73-91. 

Mangasarlan,  O.L.  (1976),  "Equivalence  of  the  complementarity  problem  to 
a  system  of  nonlinear  equation,"  SIAM  J.  Appl.  Math  31,  No.  1. 

Marwll,  E.S.  (1978),  "Exploiting  sparsity  in  Newton-like  methods,” 

Ph.D.  Thesis,  Cornell  University,  Ithaca,  NY. 


>^^T^JssgCT»ggpiaaiagEiCT rv.f-' _«■ iTj.'^^gnyaMv,1.*.".  •■lm.-i n ■■>■■■■  ■:-» 


Marwil,  E.S.  (1984),  ’’Some  numerical  results  using  direct  secant  updates 
of  matrix  factorizations,"  preprint. 

Merrill,  O.H.  (1972),  "Applications  and  extensions  of  an  algorithm  that 
computes  fixed  points  of  a  certain  upper  semi-continuous  point  to 
set  mapping,"  Ph.D.  Thesis,  University  of  Michigan. 

Moore,  R.E.  (1978),  "A  computational  test  for  convergence  of  iterative 
methods  for  nonlinear  systems,”  SIAM  J.  Numer.  Anal.  15,  No.  6. 

Ortega,  J.M.  and  Rheinboldt,  U.C.  (1970),  "Iterative  solution  of 
nonlinear  equations  in  several  variables,"  Academic  Press. 

Peitgen,  H.O.  and  Prufer,  M.  (1979),  "The  Leray-Schauder  continuation 
method  is  a  constructive  element  in  the  numerical  study  of  non¬ 
linear  eigenvalue  and  bifurcation  problems,"  In:  Functional  Dif¬ 
ferential  Equations  and  Approximation  of  Fixed  Points,  H.O. 

Peitgen,  H.O.  Walther  (eds.)  Springer  Lecture  Notes  in  Math.  730, 
326-409. 

Peitgen,  H.O.  and  Schmitt,  K.  (1981),  "Positive  and  spurious  solutions 
of  nonlinear  eigenvalue  problems,"  In:  Numerical  Solution  of 
Nonlinear  Equations,  E.L.  Allgower,  K.  Glashoff,  H.O.  Peitgen 
(eds.)  Springer  Lecture  Notes  in  Math.  878,  257-324. 

Powell,  M.J.D.  (1970),  "A  hybrid  method  for  nonlinear  equations,"  In: 
Numerical  Methods  for  Nonlinear  Algebraic  Equations,  P.  Rabinowltz 
(ed.),  Gordon  and  Breach. 

Reid,  J.K.  (1971),  "  A  note  on  the  stability  of  Gaussian  Elimination," 

J.  Inst.  Math.  Appl.  8,  374-375. 


123 


■  i  ha 


'.''T'  ^  .'•T 


Rheinboldt,  W.C.  (1974),  “On  the  solution  of  large,  sparse  sets  of 
nonlinear  equations,"  Technical  Report  TR-324,  University  of 
Maryland,  Computer  Science  Center. 

Rheinboldt,  W.C.  (1977),  "Numerical  continuation  on  methods  for  finite 
element  applications,"  In:  Formulation  and  Algorithms  in  Finite 
Element  Analysis,  K.J.  Bahte,  J.T.  Oden,  W.  Wunderlich  (eds.),  MIT 
Press,  599-631 

Rheinboldt,  W.C.  (1977),  "An  adaptive  continuation  process  for  solving 
systems  on  nonlinear  equations,”  Polish  Academy  of  Sciences,  Banach 
Ctr.  Publ.  3,  129-142. 

Rheinboldt,  W.C.  (1978),  "Numerical  methods  for  a  class  of  finite  dimen¬ 
sional  bifurcation  problems,"  SIAM  J.  Numer.  Anal.  15,  1-11. 

Rheinboldt,  W.C.  (1980),  "Solution  fields  of  nonlinear  equations  and 
continuation  methods,"  SIAM  J.  Numer.  Anal.  17,  221-237 

Rheinboldt,  W.C.  (1981),  "Numerical  analysis  of  continuation  methods  for 
nonlinear  structural  problems,"  Computers  and  Structures,  103-113. 

Rosenberg,  A.  (1983),  "Numerical  solutions  of  systems  of  simultaneous 
polynomial  equations  using  continuous  homotopy  methods,"  Ph.D. 
Thesis,  Stanford  University,  California. 

Salgal,  R.  (1976),  "On  paths  generated  by  fixed  point  algorithms,"  Math 
of  Op.  Res.  1,  359-380. 

Salgal,  R.  (1977),  "On  the  convergence  rate  of  algorithms  for  solving 
equations  that  are  based  on  methods  of  complementary  pivoting," 
Math,  of  Op.  Ras.  2,  108-124. 

Salgal,  R.  (1981),  "A  homotopy  for  solving  large,  sparse  and  structured 
fixed  point  probleme,"  Preprint,  Northwestern  University. 

124 


Saigal,  R.  and  Todd,  M.J.  (1978),  "Efficient  acceleration  techniques  for 
fixed  point  algorithms,"  SIAM  J.  Numer.  Anal.  15,  997-1007. 

Sard,  A.  (1942),  "The  measure  of  the  critical  values  of  differential 
maps,"  Bull.  AMS  48,  882-890. 

Saupe,  D.  (1982),  "On  accelerating  PL  continuation  algorithms  by 
predictor  corrector  methods,"  Math.  Prog.  23,  87-110. 

Scarf,  H.E.  (1967),  "The  approximation  of  fixed  points  of  a  continuous 
mapping,”  SIAM  J.  Appl.  Math.  15,  1328-1343. 

Scarf,  H.E.  and  Hansen,  T.  (1973),  "Computation  of  economic  equilibria," 
Yale  University  Press,  New  Haven. 

Schmidt,  C.P.  (1979),  "Approximating  differential  equations  that 

describe  homotopy  paths,"  Tech.  Report  7931,  University  of  Santa 
Clara,  California. 

Schubert,  L.K.  (1970),  "Modification  of  a  quasi-Newton  method  for 

nonlinear  equations  with  a  sparse  Jacobian,"  Math.  Comp.  24,  27-30. 

Shamplne,  L.F.  and  Gordon,  M.K.  (1975),  Computer  Solution  of  Ordinary 
Differential  Equations:  The  Initial  Value  Problem,  W.H.  Freeman 
and  Company. 

Smale,  S.  (1976),  "A  convergent  process  of  price  adjustment  and  global 
Newton  Methods,"  Journal  of  Mathematical  Economics  3  (1976), 
107-120. 

Thapa,  M.  (1981),  "Optimization  of  unconstrained  functions  with  sparse 
Hessian  matrices,"  Ph.D.  Thesis,  Stanford  University,  California. 

Todd,  M.J.  (1976),  "The  computation  of  fixed  points  and  applications," 
Spalnger  Verlag,  Lecture  Notes  in  Econ.  and  Math.  Systems  124. 


Todd,  M.J.  (1980a),  "Exploiting  structure  In  plecewlse-1 Inear  homo copy 
algorithms  for  solving  equations."  Math.  Prog.  18,  233-247. 

Todd,  M.J.  (1980b),  "Traversing  large  pieces  of  linearity  in  algorithms 
that  solve  equations  by  following  piecewise-llnear  paths,"  Math,  of 
Op.  Res.  5,  242-257. 

Toint,  Ph.L.  (1977),  "On  sparse  and  symmetric  matrix  updating  subject  to 
a  linear  equation,"  Math.  Comp.  31,  No.  140,  954-961. 

Toint,  Ph.L.  (1978),  "Some  numerical  results  using  a  sparse  matrix 

updating  formula  in  unconstrained  optimization,”  Math.  Comp.  32, 

No. 143,  839-851. 

Wacker,  H.J.  (1978)  (ed.).  Continuation  Methods,  Academic  Press. 

Wacker,  H.J.,  Zarzer,  E.  and  Zulehner,  W.  (1978),  "Optimal  stepslze 

control  for  the  globalized  Newton  method,"  In:  Continuation 

Methods,  H.J.  Wacker  (ed.),  Academic  Press,  249-276. 

Watson,  L.T.  (1979a),  "An  algorithm  that  is  globally  convergent  with 

probability  one  for  a  class  of  nonlinear  two-part  boundary  value 

problems,"  SIAM  J.  tamer.  Anal.  16,  394-401. 

2 

Watson,  L.T.  (1979b),  "Fixed  points  of  C  maps,”  J.  Comput.  Appl. 

Math.  5,  131-140. 

Watson,  L.T.  (1979c),  "A  globally  convergent  algorithm  for  computing 

2 

fixed  points  of  C  maps,”  Appl.  Math.  Comput.  5,  297-311. 

Watson,  L.T.  (1979d),  "Solving  the  nonlinear  complementary  problem  by  a 
homotopy  method,"  SIAM  J.  Control  and  Optimization  17,  36-46. 
Watson,  L.T.  (1980a),  "Computational  experience  with  a  Chow-Yorke 
algorithm,"  Math.  Prog.  19,  92-101. 


Watson,  L.T.  (1980b),  "Solving  finite  difference  approximations  to 

non-linear  two  point  boundary  value  problems  by  a  homotopy  method, 
SIAM  J.  Sci.  Stat.  Comput.  1,  467-480. 

Watson,  L.T.  (1981),  "Engineering  applications  of  the  Chow-Yorke 
algorithm,"  Appl.  Math,  and  Comp.  9,  111-133. 

Wilkinson,  J.H.  (1965),  The  Algebraic  Eigenvalue  Problem,  Clarendon 
Press,  Oxford. 


security  CLASSIFICATION  OF  THIS  PAGE  (Wham  BMtoNNO 


REPORT  DOCUMENTATION  PAGE 


iwrTT-j^TJ’inM  i 


SOL  85-8 


«.  TITLE 

Sparse  Quasi-Newton  Methods  and  the 
Continuation  Problem 


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


S.  RECIPIENT'S  CATALOO  NUMBER 


S.  TYPE  OF  REPORT  A  PERIOD  COVEREO 


Technical  Report 


S.  PERFORMING  ORO.  REPORT  NUMBER 


7.  AUTHORS 


B.  CONTRACT  OR  GRANT  NUMSERfa) 


Floyd  F.  Chadee 


N000 14-8  5-K-O  343 


S.  PERFORMING  organization  name  ANO  AOORESS 

Department  of  Operations  Research  -  SOL 
Stanford  University 
Stanford,  CA  94305 


II.  CONTROLLING  OFFICE  NAME  AND  AOORESS 

Office  of  Naval  Research  -  Dept,  of  the  Navy 
800  N.  Quincy  Street 
Arlington,  VA  22217 


It.  REPORT  OATS 

June  1985 


tS.  NUMBER  OF  PAOES 

127  pp. 


i 


MONITORING  AGENCY  NAME  *  AODRESSflf  dtttaaant  tram  Cankalll at  OlHem)  IS.  SECURITY  CLASS,  (at  Ml*  '•port; 

UNCLASSIFIED 

IS*.  OECLASSI PIC ATION/ DOWNGRADING 
SCHEDULE 


S.  DISTRIBUTION  STATEMENT  (at  Mia  Report; 


This  document  has  been  approved  for  public  release  and  sale; 
Its  distribution  Is  unlimited. 


17.  DISTRIBUTION  STATEMENT  (at  Mo  akatract  antaaad  In  Bloat  7*.  It  mttaaaa*  kam  Bap  at) 


IS.  KEY  WORDS  fCawMmu  «  taraaaa  alda  It  na 


and  IdanUtr  h y  klaak  i 


homotopy 

predictor-corrector 


SO.  ABSTRACT  (CmMnm  on  rcv«N  ifB  It  iwcaa 


-  tm4  Mm illy  by  block  m matt) 


PLEASE  SEE  ATTACHED. 


V  , 


>L  85-8:  Sparse  Quasi-Newton  Methods  and  the  Continuation  Problem, 
by  Floyd  F.  Chadee 


The  problem  of  tracing  a  smooth  path  arises  in  many  engineering 
oblems ,  the  solution  of  parametric  differential  equations  and  eigenvalue 
’obleme;  it  also  finds  application  in  the  solution  of  nonlinear  systems  of 
luations  by  homotopy  techniques .  In  many  instances,  the  path  is  defined 
iplicltly  as  the  solution  of  a  system  of  equations  whose  Jacobian  matrix 
large  and  sparse.  Robust  slmplicial  path-following  techniques  cannot  be 
plied  to  large  problems  since  the  work  involved  rises  rapidly  with 
ter easing  dimension.  This  dissertation  addresses  the  numerical  problems 
evolved  in  tracing  the  path  for  large  sparse  systems  by  the  use  of  a 
edictor-corrector  algorithm. 

The  corrector  phase  of  a  predictor-corrector  algorithm  is  very 
pensive  if  Newton's  method  is  used  as  the  corrector.  *  We  investigate  the 
te  of  sparse  quasi-Newton  techniques  to  reduce  ^thlsTfexpeuse1;-  In  order  to 
roid  the  drawbacks  of  the  sparse  Broyden  method^  -/  the  need  for  a  matrix 
ictorlzation  on  each  Iterate  and  the  need  to  store  both  the  Jacobian 
itrix  and  its  factors  ~^we  examine  techniques  for  directly  updating  the 
J  factors  of  the  approximation  to  the  Jacobian  matrix.  Under  reasonable 
isumptlons  on  the  systems  of  equations  to  be  solved,  a  proof  of  local 
-superlinear  convergence  is  presented  for  two  sparse  updating  techniques. 

A  predictor-corrector  algorithm  employing  these  sparse  updating 
schniques  is  Implemented  in  a  Fortran  code  and  numerical  results  are 
ttalned  demonstrating  the  advantages  to  be  gained  from  the  use  of 
las 1-Newton  methods  for  the  large  sparse  continuation  problem. 
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